Highly parallel discrete cosine transform engine

ABSTRACT

A discrete cosine transform engine which receives an input matrix of data and provides a transformed matrix of data, the input matrix and the transformed matrix each having a plurality of row locations and a plurality of column locations. The engine includes a plurality of input accumulators, a plurality of multiplication circuits and a plurality of output accumulators. The plurality of input accumulators accumulate data from the input matrix of data in parallel to provide a plurality of transform coefficient outputs. The plurality of multiplication circuits receive the plurality of coefficient outputs and multiply the coefficient outputs by transform constants to provide a plurality of transform products. Each output accumulator receives the transform products and accumulates the products to provide the transformed matrix of data.

BACKGROUND OF THE INVENTION

The present invention relates to discrete cosine transforms, and more particularly, to an engine for performing discrete cosine transforms.

It is desirable to compress image signals which are used with computer systems, because an image signal for a single uncompressed high resolution digitized color image can easily fill several megabytes of storage. Because images generally have low information content, very good compression rates are possible. This is especially true if, as is often the case with image signals which are used with computer systems, perfect reproduction is not required. When perfect reproduction is not required, the low frequency components of a frequency domain image signal are perceptually more important in reproducing the image than the high frequency components of the frequency domain image signal. Thus, compression schemes that are applied to the frequency domain version of an image signal do not waste bits in accurately representing the high frequency portions of the image signal.

Accordingly, it is desirable to transform an image signal from the spatial domain to the frequency domain prior to compressing the image signal. One type of transform that is desirable when transforming image signals is the discrete cosine transform (DCT). Examples of compression schemes which use a DCT include a Joint Photographic Experts Group standard (JPEG), a Moving Picture Experts Group standard (MPEG), and a Consultative Committee for International Telegraphy and Telephony standard (CCITT H.261). A DCT is similar to a Fourier transform; an 8×8 DCT can be computed as a 16×16 Fourier transform. A DCT includes a pair of transforms, a forward DCT (FDCT), which maps a digitized signal to a frequency domain signal, and an inverse DCT (IDCT), which maps a frequency domain signal to a digitized signal.
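As a point of reference, the FDCT/IDCT pair can be sketched directly from the commonly used JPEG-style 8×8 definition. The scaling convention and the function names `fdct_8x8` and `idct_8x8` are assumptions chosen for illustration; they are not taken from the equations of this disclosure.

```python
import math

N = 8

def _alpha(k):
    # Normalization factor of the common orthonormal convention (an assumption).
    return 1 / math.sqrt(2) if k == 0 else 1.0

def fdct_8x8(f):
    """Forward DCT: spatial block f[i][j] -> frequency block F[u][v]."""
    F = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for i in range(N):
                for j in range(N):
                    s += (f[i][j]
                          * math.cos((2 * i + 1) * u * math.pi / 16)
                          * math.cos((2 * j + 1) * v * math.pi / 16))
            F[u][v] = 0.25 * _alpha(u) * _alpha(v) * s
    return F

def idct_8x8(F):
    """Inverse DCT: frequency block F[u][v] -> spatial block f[i][j]."""
    f = [[0.0] * N for _ in range(N)]
    for i in range(N):
        for j in range(N):
            s = 0.0
            for u in range(N):
                for v in range(N):
                    s += (_alpha(u) * _alpha(v) * F[u][v]
                          * math.cos((2 * i + 1) * u * math.pi / 16)
                          * math.cos((2 * j + 1) * v * math.pi / 16))
            f[i][j] = 0.25 * s
    return f

# Arbitrary example block (not real image data) used for a round trip.
block = [[(3 * i + 5 * j) % 17 for j in range(N)] for i in range(N)]
coefficients = fdct_8x8(block)
recovered = idct_8x8(coefficients)
max_error = max(abs(recovered[i][j] - block[i][j])
                for i in range(N) for j in range(N))
```

Under this convention the inverse transform recovers the input block up to floating-point rounding, which is the sense in which the two transforms form a pair.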

It is known to perform DCTs in software on a personal computer to compress and decompress an image signal. Transforming a color image signal requires three times the computation of transforming a monochrome image signal. Additionally, the amount of computation required to transform an image increases as the square of the size of the screen of the computer.

When compressing an image signal, it is desirable to perform the DCTs quickly, as compressing an image signal requires many DCTs to be performed. For example, a JPEG compression of a 1024 by 1024 pixel color image requires 49,152 8×8 DCTs. If 30 images are compressed or decompressed every second, as is suggested to provide full motion video, then a DCT must be performed every 678 nsec. The calculation of a single 8×8 DCT (using the standard definition of a DCT transform) requires more than 9200 multiplications and more than 4000 additions.
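The figures in the preceding paragraph can be reproduced with a few lines of arithmetic. The sketch assumes three color components, each transformed at full resolution; the text states only the resulting totals.

```python
# Figures taken from the text: 1024 x 1024 pixels, 8 x 8 blocks, 30 images per second.
# Three full-resolution color components is an assumption that matches the stated total.
width = height = 1024
components = 3
block = 8
images_per_second = 30

dcts_per_image = (width // block) * (height // block) * components
dcts_per_second = dcts_per_image * images_per_second
ns_per_dct = 1e9 / dcts_per_second

print(dcts_per_image)      # 49152
print(round(ns_per_dct))   # 678
```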

It is known to perform DCTs on a matrix of data by first transforming the rows of data and then convolving the transformed rows to provide a transformed matrix of data. Such a system is relatively slow as the system includes the extra step of convolving the transformed rows of data.

SUMMARY OF THE INVENTION

It has been discovered that by dividing a DCT into a plurality of functions which may be performed in parallel and then performing the functions in parallel, a DCT engine may be provided which quickly and efficiently performs discrete cosine transforms on matrices of data.

In one embodiment, the invention relates to providing a plurality of input accumulator circuits which, based upon subsets of input data, provide transform coefficients, a plurality of multipliers which calculate a set of product values based upon the transform coefficients and a set of output accumulator circuits which accumulate subsets of the transform products to provide a matrix of transformed data.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 shows a block diagram of a DCT engine in accordance with the present invention.

FIG. 2 shows an example of a matrix of data on which a DCT is performed.

FIG. 3 shows a block diagram of a second DCT engine in accordance with the present invention.

FIG. 4 shows an accumulator circuit of the FIG. 3 DCT engine.

DETAILED DESCRIPTION OF THE INVENTION

The following sets forth a detailed description of the best contemplated mode for carrying out the invention. The description is intended to be illustrative of the invention and should not be taken to be limiting.

Referring to FIG. 1, a DCT engine is shown for transforming 8×8 matrices F and f, where F is the FDCT of f and f is the IDCT of F. DCT engine 20 includes a plurality of input accumulator circuits 22, a plurality of multiplier circuits 24 and a plurality of output accumulator circuits 26. Input accumulator circuits 22 receive input data from input bus 30 and provide transform coefficient data to multiplier circuits 24 via sum bus 32. Sum bus 32 is arranged as six equivalent buses, Sbus(0)-Sbus(5), which are interconnected by a cross-bar switch to permit any sum bus to receive output transform coefficient data from any adder and any multiplier circuit to receive transform coefficient data from any sum bus. By providing a plurality of equivalent buses, the buses can accommodate a high volume of data in parallel without contention. Multiplier circuits 24 provide transform product data to output accumulator circuits 26 via product bus 34. Product bus 34 is arranged as seven equivalent buses, Pbus(0)-Pbus(6), which are interconnected by a cross-bar switch to permit any product bus to receive output product data from any multiplier circuit and to permit any product bus to provide product data to any output accumulator circuit. Output accumulator circuits 26 provide output data to output bus 36.

Input accumulator circuits 22 are arranged in an 8×8 matrix, i.e., there are eight sets of accumulator circuits that are connected eight in parallel between input bus 30 and sum bus 32. For notation purposes, these accumulator circuits are referred to as IAdd(L), where L indicates the location of the accumulator circuit in the matrix, which is indexed to correspond to a singly indexed vector. For example, IAdd(0) indicates the input accumulator circuit which is located at row 0, column 0 of the matrix and IAdd(63) indicates the input accumulator circuit which is located at row 7, column 7 of the matrix. Each input accumulator circuit 22 provides a corresponding transform coefficient.
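The single-index notation can be illustrated with a short sketch. The text gives only the two endpoints (index 0 at row 0, column 0 and index 63 at row 7, column 7); the row-major ordering used here, and the helper names, are assumptions made for illustration.

```python
ROWS = COLS = 8

def location_index(row, col):
    """Single index L for the circuit at (row, col), assuming row-major order."""
    return row * COLS + col

def location_coords(L):
    """Inverse mapping: single index L -> (row, col)."""
    return divmod(L, COLS)

first = location_index(0, 0)   # IAdd(0)
last = location_index(7, 7)    # IAdd(63)
```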

Multiplier circuits 24 are arranged as a 2×8 matrix, i.e., there are eight sets of multiplier circuits that are connected two in parallel between sum bus 32 and product bus 34. For notation purposes, these multiplier circuits are referred to as Cmult(k) and Rmult(k), where k is a number from 0-7. The multiplier circuits denoted Cmult(k) perform a multiplication with one of eight different c constants and the multiplier circuits denoted Rmult(k) perform a multiplication with one of eight different r constants. Because, as discussed below, there are 16 constants by which the sum data is multiplied, providing 16 multiplier circuits allows the multiplication by the constants to be performed in parallel, with each multiplier circuit multiplying by a particular constant. Because multiplier circuits 24 multiply by fixed constants, multiplier circuits 24 are configured as look-up tables. By configuring multiplier circuits 24 as look-up tables, multiplier circuits 24 can perform the multiplications in a single cycle. Each multiplier circuit 24 includes local storage to hold the products of the multiplications until one of the product buses is available. Additionally, multiplier circuits 24 include input storage because it is possible that the respective circuits 24 may receive two inputs in a single cycle.
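A multiplication by a fixed constant can be modeled as a table read, which is one way to picture the look-up table configuration. The following is a minimal sketch only: the word widths `INPUT_BITS` and `FRAC_BITS`, the helper names, and the numeric value chosen for the constant are all hypothetical and are not specified in the text.

```python
import math

FRAC_BITS = 12    # hypothetical fixed-point precision of the stored products
INPUT_BITS = 10   # hypothetical width of the signed sum data

def make_multiplier_table(constant):
    """Precompute round(constant * x * 2**FRAC_BITS) for every signed input x."""
    lo, hi = -(1 << (INPUT_BITS - 1)), (1 << (INPUT_BITS - 1))
    return {x: round(constant * x * (1 << FRAC_BITS)) for x in range(lo, hi)}

# Assumed value for one constant, cos(2*pi/16); the disclosure's definition may differ.
C2 = math.cos(2 * math.pi / 16)
cmult2 = make_multiplier_table(C2)

def multiply(table, x):
    # The "multiplication" is a single table read.
    return table[x]

product = multiply(cmult2, 100) / (1 << FRAC_BITS)
```

Because every product is computed once when the table is built, the per-sample cost is a single indexed read, independent of the value of the constant.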

Output accumulator circuits 26 are arranged as an 8×8 matrix, i.e., there are eight sets of output accumulator circuits that are connected eight in parallel between product bus 34 and output bus 36. For notation purposes, these accumulator circuits are referred to as OAdd(L), where L indicates the location of the accumulator circuit in the matrix, which is indexed to correspond to a singly indexed vector. For example, OAdd(0) indicates the accumulator circuit which is located at row 0, column 0 of the matrix and OAdd(63) indicates the accumulator circuit which is located at row 7, column 7 of the matrix. Each output accumulator circuit provides a corresponding location of the output data to output bus 36.

DCT engine 20 performs a DCT on the 8×8 matrices F and f. In the 8×8 spatial matrix f, i and j represent the coordinates for the matrix locations within the spatial matrix, and in the 8×8 frequency matrix F, u and v are the coordinates for the matrix locations within the frequency matrix. The forward DCT of the spatial matrix is set forth by the following equation. ##EQU1## The inverse DCT for the frequency matrix is set forth by the following equation. ##EQU2## By defining the cosine constant c_(i) as ##EQU3## the trigonometric identity

    2·c_(i) ·c_(j) = c_(i+j) + c_(i-j)

is provided. By substituting this identity into the formulas for F and f, the following equations result. ##EQU4## By formulating the transforms in this way, c_(i) takes on only sixteen different values, c₀ to c₇ and -c₀ to -c₇. Note that ##EQU5##
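The identity and the reduction to sixteen values can be checked numerically. The definition of the cosine constant is not reproduced in this text, so the sketch assumes c_(k) = cos(k·π/16), for which the identity is the ordinary product-to-sum formula; the helper `fold` is likewise hypothetical.

```python
import math

def c(k):
    # Assumed definition of the cosine constant: c_k = cos(k * pi / 16).
    return math.cos(k * math.pi / 16)

# Product-to-sum identity: 2 * c_i * c_j = c_(i+j) + c_(i-j)
identity_holds = all(
    math.isclose(2 * c(i) * c(j), c(i + j) + c(i - j), abs_tol=1e-12)
    for i in range(8) for j in range(8)
)

def fold(k):
    """Reduce c_k to (sign, index) with index in 0..7, or (0, None) when c_k is zero."""
    k %= 32                  # cosine has period 2*pi, i.e. 32 steps of pi/16
    if k > 16:
        k = 32 - k           # cos(x) = cos(2*pi - x)
    if k == 8:
        return 0, None       # cos(pi/2) = 0
    if k > 8:
        return -1, 16 - k    # cos(x) = -cos(pi - x)
    return 1, k

# Every c_k with an integer index equals +c_0..+c_7, -c_0..-c_7, or zero.
folding_holds = all(
    math.isclose(c(k), sign * c(index) if sign else 0.0, abs_tol=1e-12)
    for k in range(-64, 65)
    for sign, index in [fold(k)]
)
```

Under this assumption any index produced by the identity folds back onto one of the eight magnitudes c₀ to c₇ with a sign, or onto zero.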

By manipulating the DCT formulas so that the cosine constants are isolated, it is possible to use highly parallel accumulations when calculating the FDCT and the IDCT. Additionally, by grouping functions which use the same constant terms, it is possible to perform highly parallel multiplications using the constant terms. The forward and inverse DCTs are dealt with separately as set forth by the following equations, where σ is the forward DCT coefficient vector and τ is the inverse DCT coefficient vector.

    F_(uv) = Σ_(k) σ_(k) ·c_(k)

    f_(ij) = Σ_(k) τ_(k) ·c_(k)

The computation of the FDCT is set forth as follows. Notation for the matrix of data is indexed as a singly indexed vector as represented by the subscripts of X in FIG. 2. The first step in the FDCT is calculating the forward transform coefficient vectors σ₀ to σ₆₃ as follows. ##EQU6##

After the forward transform coefficient vectors are calculated, these transform coefficient vectors are used to generate the FDCT frequency matrix as follows, where r_(i) = c_(i) ·c₄.

F₀ =c₀ ·σ₆₃

F₁ =r₁ ·σ₅₆ +r₃ ·σ₅₅ +r₅ ·σ₅₄ +r₇ ·σ₅₃

F₂ =r₂ ·σ₆₀ +r₆ ·σ₅₉

F₃ =-r₁ ·σ₅₄ +r₃ ·σ₅₆ -r₅ ·σ₅₃ -r₇ ·σ₅₅

F₄ =r₄ ·σ₆₂

F₅ =-r₁ ·σ₅₅ +r₃ ·σ₅₃ +r₅ ·σ₅₆ +r₇ ·σ₅₄

F₆ =-r₂ ·σ₅₉ +r₆ ·σ₆₀

F₇ =-r₁ ·σ₅₃ +r₃ ·σ₅₄ -r₅ ·σ₅₅ +r₇ ·σ₅₆

F₈ =r₁ ·σ₅₂ +r₃ ·σ₅₁ +r₅ ·σ₅₀ +r₇ ·σ₄₉

F₉ =c₀ ·σ₀ +c₂ ·σ₁₃ +c₄ ·σ₁₄ +c₆ ·σ₁₅

F₁₀ =c₁ ·σ₁₆ +c₃ ·σ₁₇ +c₅ ·σ₁₈ +c₇ ·σ₁₉

F₁₁ =c₀ ·σ₁ +c₂ ·σ₂₁ +c₄ ·σ₂₂ +c₆ ·σ₂₃

F₁₂ =c₁ ·σ₄ +c₃ ·σ₇ +c₅ ·σ₆ +c₇ ·σ₅

F₁₃ =c₀ ·σ₂ +c₂ ·σ₂₉ +c₄ ·σ₃₀ +c₆ ·σ₃₁

F₁₄ =c₁ ·σ₃₂ +c₃ ·σ₃₃ +c₅ ·σ₃₄ +c₇ ·σ₃₅

F₁₅ =c₀ ·σ₃ +c₂ ·σ₃₇ +c₄ ·σ₃₈ +c₆ ·σ₄₀

F₁₆ =r₂ ·σ₅₈ +r₆ ·σ₅₇

F₁₇ =c₁ ·σ₁₂ +c₃ ·σ₄₁ +c₅ ·σ₃₉ +c₇ ·σ₂₀

F₁₈ =c₀ ·σ₂₄ +c₄ ·σ₄₈

F₁₉ =c₁ ·σ₂₅ +c₃ ·σ₂₆ +c₅ ·σ₂₇ +c₇ ·σ₂₈

F₂₀ =c₂ ·σ₄₆ +c₆ ·σ₄₇

F₂₁ =-c₁ ·σ₂₈ +c₃ ·σ₂₇ -c₅ ·σ₂₆ +c₇ ·σ₂₅

F₂₂ =c₀ ·σ₃₆ +c₄ ·σ₄₃

F₂₃ =c₁ ·σ₂₀ -c₃ ·σ₃₉ +c₅ ·σ₄₁ -c₇ ·σ₁₂

F₂₄ =-r₁ ·σ₅₀ +r₃ ·σ₅₂ -r₅ ·σ₄₉ -r₇ ·σ₅₁

F₂₅ =-c₀ ·σ₂ +c₂ ·σ₃₁ +c₄ ·σ₃₀ -c₆ ·σ₂₉

F₂₆ =c₁ ·σ₃₄ -c₃ ·σ₃₂ +c₅ ·σ₃₅ +c₇ ·σ₃₃

F₂₇ =c₀ ·σ₀ -c₂ ·σ₁₅ -c₄ ·σ₁₄ +c₆ ·σ₁₃

F₂₈ =c₁ ·σ₆ -c₃ ·σ₄ +c₅ ·σ₅ +c₇ ·σ₇

F₂₉ =-c₀ ·σ₃ +c₂ ·σ₄₀ +c₄ ·σ₃₈ -c₆ ·σ₃₇

F₃₀ =-c₁ ·σ₁₈ +c₃ ·σ₁₆ -c₅ ·σ₁₉ -c₇ ·σ₁₇

F₃₁ =-c₀ ·σ₁ +c₂ ·σ₂₃ +c₄ ·σ₂₂ -c₆ ·σ₂₁

F₃₂ =r₄ ·σ₆₁

F₃₃ =c₁ ·σ₈ +c₃ ·σ₁₀ +c₅ ·σ₁₁ +c₇ ·σ₉

F₃₄ =c₂ ·σ₄₅ +c₆ ·σ₄₄

F₃₅ =c₁ ·σ₁₁ -c₃ ·σ₈ +c₅ ·σ₉ +c₇ ·σ₁₀

F₃₆ =c₀ ·σ₄₂

F₃₇ =c₁ ·σ₁₀ -c₃ ·σ₉ -c₅ ·σ₈ -c₇ ·σ₁₁

F₃₈ =c₂ ·σ₄₄ -c₆ ·σ₄₅

F₃₉ =-c₁ ·σ₉ +c₃ ·σ₁₁ -c₅ ·σ₁₀ +c₇ ·σ₈

F₄₀ =-r₁ ·σ₅₁ +r₃ ·σ₄₉ +r₅ ·σ₅₂ +r₇ ·σ₅₀

F₄₁ =-c₀ ·σ₁ -c₂ ·σ₂₃ +c₄ ·σ₂₂ +c₆ ·σ₂₁

F₄₂ =-c₁ ·σ₃₃ +c₃ ·σ₃₅ +c₅ ·σ₃₂ +c₇ ·σ₃₄

F₄₃ =c₀ ·σ₃ +c₂ ·σ₄₀ -c₄ ·σ₃₈ -c₆ ·σ₃₇

F₄₄ =c₁ ·σ₇ -c₃ ·σ₅ -c₅ ·σ₄ -c₇ ·σ₆

F₄₅ =c₀ ·σ₀ +c₂ ·σ₁₅ -c₄ ·σ₁₄ -c₆ ·σ₁₃

F₄₆ =c₁ ·σ₁₇ -c₃ ·σ₁₉ -c₅ ·σ₁₆ -c₇ ·σ₁₈

F₄₇ =c₀ ·σ₂ +c₂ ·σ₃₁ -c₄ ·σ₃₀ -c₆ ·σ₂₉

F₄₈ =-r₂ ·σ₅₇ +r₆ ·σ₅₈

F₄₉ =-c₁ ·σ₂₆ +c₃ ·σ₂₈ +c₅ ·σ₂₅ +c₇ ·σ₂₇

F₅₀ =-c₀ ·σ₃₆ +c₄ ·σ₄₃

F₅₁ =-c₁ ·σ₃₉ +c₃ ·σ₁₂ -c₅ ·σ₂₀ -c₇ ·σ₄₁

F₅₂ =c₂ ·σ₄₇ -c₆ ·σ₄₆

F₅₃ =c₁ ·σ₄₁ -c₃ ·σ₂₀ -c₅ ·σ₁₂ -c₇ ·σ₃₉

F₅₄ =c₀ ·σ₂₄ -c₄ ·σ₄₈

F₅₅ =c₁ ·σ₂₇ -c₃ ·σ₂₅ +c₅ ·σ₂₈ +c₇ ·σ₂₆

F₅₆ =-r₁ ·σ₄₉ +r₃ ·σ₅₀ -r₅ ·σ₅₁ +r₇ ·σ₅₂

F₅₇ =-c₀ ·σ₃ +c₂ ·σ₃₇ -c₄ ·σ₃₈ +c₆ ·σ₄₀

F₅₈ =c₁ ·σ₁₉ -c₃ ·σ₁₈ +c₅ ·σ₁₇ -c₇ ·σ₁₆

F₅₉ =c₀ ·σ₂ -c₂ ·σ₂₉ +c₄ ·σ₃₀ -c₆ ·σ₃₁

F₆₀ =-c₁ ·σ₅ +c₃ ·σ₆ -c₅ ·σ₇ +c₇ ·σ₄

F₆₁ =-c₀ ·σ₁ +c₂ ·σ₂₁ -c₄ ·σ₂₂ +c₆ ·σ₂₃

F₆₂ =c₁ ·σ₃₅ -c₃ ·σ₃₄ +c₅ ·σ₃₃ -c₇ ·σ₃₂

F₆₃ =c₀ ·σ₀ -c₂ ·σ₁₃ +c₄ ·σ₁₄ -c₆ ·σ₁₅

More specifically, a spatial matrix of data elements is provided to input accumulators 22, which calculate the forward coefficient vectors. The matrix of multiplier circuits 24 multiplies the forward coefficient vectors by the constants c₀ to c₇ and r₀ to r₇ to generate transform products. Output accumulators 26 accumulate the transform products and provide a frequency matrix of data elements which is the forward DCT of the input spatial data matrix.
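The multiply and output-accumulate stages can be sketched for two of the outputs listed above, F₂ and F₆, which share the constants r₂ and r₆. The sketch assumes c_(k) = cos(k·π/16) and reads r_(k) as c_(k)·c₄; the σ values are arbitrary placeholders rather than real coefficient data, and the table layout is hypothetical.

```python
import math

def c(k):
    return math.cos(k * math.pi / 16)   # assumed definition of c_k

def r(k):
    return c(k) * c(4)                  # r_k read here as c_k * c_4

# (sign, constant name, sigma index) triples copied from the F2 and F6 equations.
TERMS = {
    2: [(+1, "r2", 60), (+1, "r6", 59)],
    6: [(-1, "r2", 59), (+1, "r6", 60)],
}
CONSTANTS = {"r2": r(2), "r6": r(6)}

def transform(sigma):
    # Multiply stage: each constant's multiplier produces every product that uses it.
    products = {}
    for terms in TERMS.values():
        for _, name, idx in terms:
            products[(name, idx)] = CONSTANTS[name] * sigma[idx]
    # Output accumulate stage: each output is a signed sum of products.
    return {out: sum(sign * products[(name, idx)] for sign, name, idx in terms)
            for out, terms in TERMS.items()}

sigma = {59: 3.0, 60: -2.0}   # arbitrary example values
F = transform(sigma)
```

Grouping the products by constant mirrors the arrangement in which each multiplier circuit is dedicated to one constant, while the signs are applied only at the output accumulators.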

DCT engine 20 receives two input data elements and provides two output data elements with each clock signal. The input data elements are ordered by rows left to right across the data matrix. However, the output data elements are ordered diagonally in conformance with the above-mentioned JPEG specifications. The arrowed line in FIG. 2 shows the order in which the output data elements are provided.
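For orientation, the conventional JPEG zigzag sequence for an 8×8 block can be generated as follows. FIG. 2 is not reproduced here, so treating the diagonal output order as this conventional sequence is an assumption, and `zigzag_order` is a hypothetical helper name.

```python
def zigzag_order(n=8):
    """Row-major indices of an n x n block listed in the conventional zigzag order."""
    order = []
    for d in range(2 * n - 1):                 # d = row + col, one anti-diagonal at a time
        rows = range(max(0, d - n + 1), min(d, n - 1) + 1)
        if d % 2 == 0:
            rows = reversed(rows)              # even diagonals run from bottom-left to top-right
        for row in rows:
            order.append(row * n + (d - row))
    return order

order = zigzag_order()
print(order[:10])   # [0, 1, 8, 16, 9, 2, 3, 10, 17, 24]
```

The input side, by contrast, is simply row-major: indices 0 to 63 in increasing order.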

DCT engine 20 uses input accumulators 22, multipliers 24 and output accumulators 26 to perform the many calculations of a FDCT in parallel. These calculations are divided into discrete calculations which are performed during different cycles. It takes a total of 82 cycles to completely transform an 8×8 matrix, from the cycle when the first location of input data is provided to DCT engine 20 until the cycle when the last location of transformed data is output by DCT engine 20. Individual input accumulators 22 and output accumulators 26 maintain running totals as each new entry is accumulated during a cycle. After the value which an accumulator is calculating is complete, that value is provided to the next stage of DCT engine 20. The individual cycles of the FDCT are now discussed in detail.

During cycle 1, f₁ and f₀ are provided to the input bus. Additionally, IAdd(0), IAdd(6), IAdd(7), IAdd(10), IAdd(11), IAdd(12), IAdd(13), IAdd(16), IAdd(17), IAdd(21), IAdd(22), IAdd(24), IAdd(25), IAdd(27), IAdd(30), IAdd(31), IAdd(34), IAdd(35), IAdd(40), IAdd(41), IAdd(42), IAdd(43), IAdd(44), IAdd(45), IAdd(46), IAdd(47), IAdd(48), IAdd(52), IAdd(56), IAdd(58), IAdd(60), IAdd(61), IAdd(62) and IAdd(63) all add f₀; IAdd(8), IAdd(9), IAdd(12), IAdd(13), IAdd(14), IAdd(18), IAdd(19), IAdd(28), IAdd(39), IAdd(45), IAdd(48), IAdd(52), IAdd(55), IAdd(58), IAdd(59), IAdd(61) and IAdd(63) all add f₁; and IAdd(2), IAdd(6), IAdd(7), IAdd(23), IAdd(27), IAdd(29), IAdd(32), IAdd(33), IAdd(36), IAdd(38), IAdd(40), IAdd(42), IAdd(43), IAdd(44), IAdd(46), IAdd(47) and IAdd(62) all subtract f₁.
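The running-total behavior of the input accumulators can be sketched for a handful of the accumulators named in cycle 1. The class and function names and the two input values are hypothetical; the add/subtract assignments are copied from the cycle 1 description.

```python
# Signed contributions for five accumulators in cycle 1:
# +1 means "add", -1 means "subtract".
CYCLE_1 = {
    0:  {"f0": +1},
    2:  {"f1": -1},
    6:  {"f0": +1, "f1": -1},
    8:  {"f1": +1},
    63: {"f0": +1, "f1": +1},
}

class InputAccumulator:
    """Keeps a running total across cycles."""
    def __init__(self):
        self.total = 0

    def accumulate(self, sign, value):
        self.total += sign * value

def run_cycle(accumulators, schedule, inputs):
    # All accumulators see the same input values during a cycle.
    for index, contributions in schedule.items():
        for name, sign in contributions.items():
            accumulators[index].accumulate(sign, inputs[name])

accumulators = {index: InputAccumulator() for index in CYCLE_1}
run_cycle(accumulators, CYCLE_1, {"f0": 10, "f1": 4})   # example input values
totals = {index: acc.total for index, acc in accumulators.items()}
print(totals)   # {0: 10, 2: -4, 6: 6, 8: 4, 63: 14}
```

Later cycles would call `run_cycle` again with their own schedules, so each total keeps growing until the accumulator's coefficient is complete.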

During cycle 2, f₃ and f₂ are provided to the input bus. Additionally, IAdd(8), IAdd(14), IAdd(15), IAdd(20), IAdd(31), IAdd(32), IAdd(33), IAdd(36), IAdd(37), IAdd(38), IAdd(41), IAdd(43), IAdd(44), IAdd(52), IAdd(54), IAdd(58), IAdd(61) and IAdd(63) all add f₂; IAdd(1), IAdd(6), IAdd(7), IAdd(9), IAdd(18), IAdd(19), IAdd(21), IAdd(25), IAdd(26), IAdd(42), IAdd(45), IAdd(46), IAdd(47), IAdd(48), IAdd(59) and IAdd(62) all subtract f₂; IAdd(6), IAdd(7), IAdd(10), IAdd(15), IAdd(29), IAdd(30), IAdd(39), IAdd(42), IAdd(46), IAdd(47), IAdd(52), IAdd(53), IAdd(58), IAdd(61), IAdd(62) and IAdd(63) all add f₃; and IAdd(3), IAdd(11), IAdd(16), IAdd(17), IAdd(20), IAdd(22), IAdd(23), IAdd(24), IAdd(26), IAdd(28), IAdd(34), IAdd(35), IAdd(37), IAdd(43), IAdd(44), IAdd(45), IAdd(48) and IAdd(60) all subtract f₃.

During cycle 3, f₅ and f₄ are provided to the input bus. Additionally, IAdd(3), IAdd(6), IAdd(7), IAdd(11), IAdd(20), IAdd(22), IAdd(23), IAdd(26), IAdd(28), IAdd(37), IAdd(42), IAdd(46), IAdd(47), IAdd(52), IAdd(58), IAdd(61), IAdd(62) and IAdd(63) all add f₄; IAdd(10), IAdd(15), IAdd(16), IAdd(17), IAdd(24), IAdd(29), IAdd(30), IAdd(34), IAdd(35), IAdd(39), IAdd(43), IAdd(44), IAdd(45), IAdd(48), IAdd(53) and IAdd(60) all subtract f₄; IAdd(1), IAdd(9), IAdd(21), IAdd(25), IAdd(26), IAdd(32), IAdd(33), IAdd(36), IAdd(43), IAdd(44), IAdd(52), IAdd(58), IAdd(61) and IAdd(63) all add f₅; and IAdd(6), IAdd(7), IAdd(8), IAdd(14), IAdd(15), IAdd(18), IAdd(19), IAdd(20), IAdd(31), IAdd(37), IAdd(38), IAdd(41), IAdd(42), IAdd(45), IAdd(46), IAdd(47), IAdd(48), IAdd(54), IAdd(59) and IAdd(62) all subtract f₅.

During cycle 4, f₇ and f₆ are provided to the input bus. Additionally, IAdd(2), IAdd(18), IAdd(19), IAdd(23), IAdd(27), IAdd(29), IAdd(38), IAdd(40), IAdd(45), IAdd(48), IAdd(52), IAdd(58), IAdd(59), IAdd(61) and IAdd(63) all add f₆; IAdd(6), IAdd(7), IAdd(8), IAdd(9), IAdd(12), IAdd(13), IAdd(14), IAdd(28), IAdd(32), IAdd(33), IAdd(36), IAdd(39), IAdd(42), IAdd(43), IAdd(44), IAdd(46), IAdd(47), IAdd(55) and IAdd(62) all subtract f₆; IAdd(6), IAdd(7), IAdd(16), IAdd(17), IAdd(24), IAdd(34), IAdd(35), IAdd(42), IAdd(43), IAdd(44), IAdd(45), IAdd(46), IAdd(47), IAdd(48), IAdd(52), IAdd(58), IAdd(60), IAdd(61), IAdd(62) and IAdd(63) all add f₇; and IAdd(0), IAdd(10), IAdd(11), IAdd(12), IAdd(13), IAdd(21), IAdd(22), IAdd(25), IAdd(27), IAdd(30), IAdd(31), IAdd(40), IAdd(41) and IAdd(56) all subtract f₇.

During cycle 5, f₈ and f₉ are provided to the input bus. Additionally, IAdd(1), IAdd(4), IAdd(5), IAdd(13), IAdd(14), IAdd(16), IAdd(18), IAdd(20), IAdd(23), IAdd(26), IAdd(29), IAdd(33), IAdd(36), IAdd(38), IAdd(39), IAdd(46), IAdd(48), IAdd(51), IAdd(56), IAdd(57), IAdd(60), IAdd(62) and IAdd(63) all add f₈; IAdd(10), IAdd(11), IAdd(28), IAdd(35), IAdd(40), IAdd(42), IAdd(43), IAdd(44), IAdd(45), IAdd(47) and IAdd(61) all subtract f₈; IAdd(0), IAdd(15), IAdd(17), IAdd(23), IAdd(24), IAdd(26), IAdd(41), IAdd(42), IAdd(44), IAdd(47), IAdd(51), IAdd(55), IAdd(57), IAdd(59) and IAdd(63) all add f₉; and IAdd(4), IAdd(5), IAdd(8), IAdd(9), IAdd(19), IAdd(20), IAdd(22), IAdd(25), IAdd(29), IAdd(30), IAdd(32), IAdd(34), IAdd(37), IAdd(43), IAdd(45), IAdd(46), IAdd(48), IAdd(61) and IAdd(62) all subtract f₉.

During cycle 6, f₁₀ and f₁₁ are provided to the input bus. Additionally, IAdd(3), IAdd(9), IAdd(12), IAdd(13), IAdd(19), IAdd(30), IAdd(32), IAdd(34), IAdd(40), IAdd(42), IAdd(43), IAdd(45), IAdd(47), IAdd(48), IAdd(51), IAdd(54), IAdd(57) and IAdd(63) all add f₁₀; IAdd(4), IAdd(5), IAdd(8), IAdd(17), IAdd(21), IAdd(22), IAdd(24), IAdd(27), IAdd(28), IAdd(31), IAdd(39), IAdd(44), IAdd(46), IAdd(59), IAdd(61) and IAdd(62) all subtract f₁₀; IAdd(2), IAdd(4), IAdd(5), IAdd(11), IAdd(12), IAdd(14), IAdd(27), IAdd(31), IAdd(35), IAdd(43), IAdd(44), IAdd(45), IAdd(46), IAdd(51), IAdd(53), IAdd(57), IAdd(62) and IAdd(63) all add f₁₁; and IAdd(10), IAdd(15), IAdd(16), IAdd(18), IAdd(21), IAdd(25), IAdd(33), IAdd(36), IAdd(37), IAdd(38), IAdd(41), IAdd(42), IAdd(47), IAdd(48), IAdd(60) and IAdd(61) all subtract f₁₁.

During cycle 7, f₁₂ and f₁₃ are provided to the input bus. Additionally, IAdd(4), IAdd(5), IAdd(10), IAdd(15), IAdd(21), IAdd(25), IAdd(35), IAdd(37), IAdd(38), IAdd(41), IAdd(43), IAdd(44), IAdd(45), IAdd(46), IAdd(51), IAdd(57), IAdd(62) and IAdd(63) all add f₁₂; IAdd(2), IAdd(11), IAdd(12), IAdd(14), IAdd(16), IAdd(18), IAdd(27), IAdd(31), IAdd(33), IAdd(36), IAdd(42), IAdd(47), IAdd(48), IAdd(53), IAdd(60) and IAdd(61) all subtract f₁₂; IAdd(8), IAdd(19), IAdd(21), IAdd(22), IAdd(27), IAdd(28), IAdd(31), IAdd(32), IAdd(34), IAdd(39), IAdd(42), IAdd(43), IAdd(45), IAdd(47), IAdd(48), IAdd(51), IAdd(57) and IAdd(63) all add f₁₃; and IAdd(3), IAdd(4), IAdd(5), IAdd(9), IAdd(12), IAdd(13), IAdd(17), IAdd(24), IAdd(30), IAdd(40), IAdd(44), IAdd(46), IAdd(54), IAdd(59), IAdd(61) and IAdd(62) all subtract f₁₃.

During cycle 8, f₁₄ and f₁₅ are provided to the input bus. Additionally, IAdd(8), IAdd(9), IAdd(17), IAdd(20), IAdd(22), IAdd(24), IAdd(25), IAdd(29), IAdd(30), IAdd(37), IAdd(42), IAdd(44), IAdd(47), IAdd(51), IAdd(57), IAdd(59) and IAdd(63) all add f₁₄; IAdd(0), IAdd(4), IAdd(5), IAdd(15), IAdd(19), IAdd(23), IAdd(26), IAdd(32), IAdd(34), IAdd(41), IAdd(43), IAdd(45), IAdd(46), IAdd(48), IAdd(55), IAdd(61) and IAdd(62) all subtract f₁₄; IAdd(4), IAdd(5), IAdd(10), IAdd(11), IAdd(16), IAdd(18), IAdd(28), IAdd(33), IAdd(36), IAdd(40), IAdd(46), IAdd(48), IAdd(51), IAdd(57), IAdd(60), IAdd(62) and IAdd(63) all add f₁₅; and IAdd(1), IAdd(13), IAdd(14), IAdd(20), IAdd(23), IAdd(26), IAdd(29), IAdd(35), IAdd(38), IAdd(39), IAdd(42), IAdd(43), IAdd(44), IAdd(45), IAdd(47), IAdd(56) and IAdd(61) all subtract f₁₅.

During cycle 9, f₁₆ and f₁₇ are provided to the input bus. Additionally, IAdd(2), IAdd(4), IAdd(14), IAdd(15), IAdd(17), IAdd(19), IAdd(21), IAdd(28), IAdd(32), IAdd(37), IAdd(43), IAdd(47), IAdd(50), IAdd(56), IAdd(60), IAdd(62) and IAdd(63) all add f₁₆; IAdd(5), IAdd(10), IAdd(11), IAdd(20), IAdd(26), IAdd(31), IAdd(34), IAdd(36), IAdd(38), IAdd(39), IAdd(42), IAdd(44), IAdd(45), IAdd(46), IAdd(48), IAdd(57) and IAdd(61) all subtract f₁₆; IAdd(5), IAdd(13), IAdd(16), IAdd(20), IAdd(22), IAdd(25), IAdd(40), IAdd(42), IAdd(43), IAdd(44), IAdd(46), IAdd(48), IAdd(50), IAdd(55), IAdd(59) and IAdd(63) all add f₁₇; and IAdd(3), IAdd(4), IAdd(8), IAdd(9), IAdd(18), IAdd(21), IAdd(24), IAdd(26), IAdd(30), IAdd(31), IAdd(33), IAdd(35), IAdd(41), IAdd(45), IAdd(47), IAdd(57), IAdd(61) and IAdd(62) all subtract f₁₇.

During cycle 10, f₁₈ and f₁₉ are provided to the input bus. Additionally, IAdd(0), IAdd(5), IAdd(9), IAdd(18), IAdd(24), IAdd(27), IAdd(28), IAdd(29), IAdd(33), IAdd(35), IAdd(37), IAdd(39), IAdd(42), IAdd(45), IAdd(46), IAdd(50), IAdd(54) and IAdd(63) all add f₁₈; IAdd(4), IAdd(8), IAdd(12), IAdd(15), IAdd(16), IAdd(22), IAdd(23), IAdd(30), IAdd(43), IAdd(44), IAdd(47), IAdd(48), IAdd(57), IAdd(59), IAdd(61) and IAdd(62) all subtract f₁₈; IAdd(4), IAdd(11), IAdd(13), IAdd(23), IAdd(25), IAdd(29), IAdd(34), IAdd(36), IAdd(41), IAdd(44), IAdd(45), IAdd(47), IAdd(48), IAdd(50), IAdd(53), IAdd(62) and IAdd(63) all add f₁₉; and IAdd(1), IAdd(5), IAdd(10), IAdd(12), IAdd(14), IAdd(17), IAdd(19), IAdd(27), IAdd(32), IAdd(38), IAdd(40), IAdd(42), IAdd(43), IAdd(46), IAdd(57), IAdd(60) and IAdd(61) all subtract f₁₉.

During cycle 11, f₂₀ and f₂₁ are provided to the input bus. Additionally, IAdd(4), IAdd(10), IAdd(12), IAdd(14), IAdd(27), IAdd(34), IAdd(36), IAdd(38), IAdd(40), IAdd(44), IAdd(45), IAdd(47), IAdd(48), IAdd(50), IAdd(62) and IAdd(63) all add f₂₀; IAdd(5), IAdd(11), IAdd(13), IAdd(17), IAdd(19), IAdd(23), IAdd(25), IAdd(29), IAdd(32), IAdd(41), IAdd(42), IAdd(43), IAdd(46), IAdd(53), IAdd(57), IAdd(60) and IAdd(61) all subtract f₂₀; IAdd(5), IAdd(8), IAdd(12), IAdd(15), IAdd(18), IAdd(22), IAdd(23), IAdd(24), IAdd(30), IAdd(33), IAdd(35), IAdd(42), IAdd(45), IAdd(46), IAdd(50) and IAdd(63) all add f₂₁; and IAdd(0), IAdd(4), IAdd(9), IAdd(16), IAdd(27), IAdd(28), IAdd(29), IAdd(37), IAdd(39), IAdd(43), IAdd(44), IAdd(47), IAdd(48), IAdd(54), IAdd(57), IAdd(59), IAdd(61) and IAdd(62) all subtract f₂₁.

During cycle 12, f₂₂ and f₂₃ are provided to the input bus. Additionally, IAdd(3), IAdd(5), IAdd(8), IAdd(9), IAdd(16), IAdd(21), IAdd(26), IAdd(30), IAdd(31), IAdd(41), IAdd(42), IAdd(43), IAdd(44), IAdd(46), IAdd(48), IAdd(50), IAdd(59) and IAdd(63) all add f₂₂; IAdd(4), IAdd(13), IAdd(18), IAdd(20), IAdd(22), IAdd(24), IAdd(25), IAdd(33), IAdd(35), IAdd(40), IAdd(45), IAdd(47), IAdd(55), IAdd(57), IAdd(61) and IAdd(62) all subtract f₂₂; IAdd(4), IAdd(10), IAdd(11), IAdd(17), IAdd(19), IAdd(20), IAdd(26), IAdd(31), IAdd(32), IAdd(38), IAdd(39), IAdd(43), IAdd(47), IAdd(50), IAdd(60), IAdd(62) and IAdd(63) all add f₂₃; and IAdd(2), IAdd(5), IAdd(14), IAdd(15), IAdd(21), IAdd(28), IAdd(34), IAdd(36), IAdd(37), IAdd(42), IAdd(44), IAdd(45), IAdd(46), IAdd(48), IAdd(56), IAdd(57) and IAdd(61) all subtract f₂₃.

During cycle 13, f₂₄ and f₂₅ are provided to the input bus. Additionally, IAdd(3), IAdd(7), IAdd(10), IAdd(11), IAdd(15), IAdd(18), IAdd(22), IAdd(29), IAdd(32), IAdd(42), IAdd(44), IAdd(45), IAdd(49), IAdd(56), IAdd(60), IAdd(61), IAdd(62) and IAdd(63) all add f₂₄; IAdd(6), IAdd(12), IAdd(19), IAdd(23), IAdd(24), IAdd(25), IAdd(27), IAdd(30), IAdd(33), IAdd(37), IAdd(41), IAdd(43), IAdd(46), IAdd(47), IAdd(48) and IAdd(58) all subtract f₂₄; IAdd(6), IAdd(8), IAdd(9), IAdd(14), IAdd(16), IAdd(21), IAdd(27), IAdd(35), IAdd(36), IAdd(38), IAdd(43), IAdd(45), IAdd(46), IAdd(47), IAdd(49), IAdd(55), IAdd(59), IAdd(61) and IAdd(63) all add f₂₅; and IAdd(1), IAdd(7), IAdd(12), IAdd(15), IAdd(17), IAdd(28), IAdd(31), IAdd(34), IAdd(37), IAdd(39), IAdd(42), IAdd(44), IAdd(48), IAdd(58) and IAdd(62) all subtract f₂₅.

During cycle 14, f₂₆ and f₂₇ are provided to the input bus. Additionally, IAdd(2), IAdd(6), IAdd(8), IAdd(13), IAdd(17), IAdd(25), IAdd(26), IAdd(34), IAdd(38), IAdd(44), IAdd(46), IAdd(47), IAdd(48), IAdd(49), IAdd(54), IAdd(61) and IAdd(63) all add f₂₆; IAdd(0), IAdd(7), IAdd(10), IAdd(19), IAdd(20), IAdd(22), IAdd(24), IAdd(26), IAdd(28), IAdd(30), IAdd(33), IAdd(42), IAdd(43), IAdd(48), IAdd(49), IAdd(53), IAdd(61), IAdd(62) and IAdd(63) all add f₂₇; and IAdd(6), IAdd(11), IAdd(13), IAdd(18), IAdd(21), IAdd(31), IAdd(32), IAdd(39), IAdd(40), IAdd(44), IAdd(45), IAdd(46), IAdd(47), IAdd(58) and IAdd(60) all subtract f₂₇.

During cycle 15, f₂₈ and f₂₉ are provided to the input bus. Additionally, IAdd(7), IAdd(11), IAdd(13), IAdd(19), IAdd(21), IAdd(24), IAdd(31), IAdd(33), IAdd(39), IAdd(40), IAdd(42), IAdd(43), IAdd(48), IAdd(49), IAdd(61), IAdd(62) and IAdd(63) all add f₂₈; IAdd(0), IAdd(6), IAdd(10), IAdd(18), IAdd(20), IAdd(22), IAdd(26), IAdd(28), IAdd(30), IAdd(32), IAdd(44), IAdd(45), IAdd(46), IAdd(47), IAdd(53), IAdd(58) and IAdd(60) all subtract f₂₈; IAdd(6), IAdd(9), IAdd(14), IAdd(17), IAdd(20), IAdd(23), IAdd(29), IAdd(34), IAdd(40), IAdd(41), IAdd(44), IAdd(46), IAdd(47), IAdd(48), IAdd(49), IAdd(61) and IAdd(63) all add f₂₉; and IAdd(2), IAdd(7), IAdd(8), IAdd(13), IAdd(16), IAdd(25), IAdd(26), IAdd(35), IAdd(36), IAdd(38), IAdd(42), IAdd(43), IAdd(45), IAdd(54), IAdd(58), IAdd(59) and IAdd(62) all subtract f₂₉.

During cycle 16, f₃₀ and f₃₁ are provided to the input bus. Additionally, IAdd(1), IAdd(6), IAdd(12), IAdd(15), IAdd(16), IAdd(28), IAdd(31), IAdd(35), IAdd(36), IAdd(37), IAdd(39), IAdd(43), IAdd(45), IAdd(46), IAdd(47), IAdd(49), IAdd(59), IAdd(61) and IAdd(63) all add f₃₀; IAdd(7), IAdd(8), IAdd(9), IAdd(14), IAdd(17), IAdd(21), IAdd(27), IAdd(34), IAdd(38), IAdd(42), IAdd(44), IAdd(48), IAdd(55), IAdd(58) and IAdd(62) all subtract f₃₀; IAdd(7), IAdd(12), IAdd(18), IAdd(23), IAdd(25), IAdd(27), IAdd(30), IAdd(32), IAdd(37), IAdd(41), IAdd(42), IAdd(44), IAdd(45), IAdd(49), IAdd(60), IAdd(61), IAdd(62) and IAdd(63) all add f₃₁; and IAdd(3), IAdd(6), IAdd(10), IAdd(11), IAdd(15), IAdd(19), IAdd(22), IAdd(24), IAdd(29), IAdd(33), IAdd(43), IAdd(46), IAdd(47), IAdd(48), IAdd(56) and IAdd(58) all subtract f₃₁.

During cycle 17, f₃₂ and f₃₃ are provided to the input bus. Additionally, IAdd(6), IAdd(10), IAdd(11), IAdd(19), IAdd(23), IAdd(30), IAdd(33), IAdd(37), IAdd(42), IAdd(44), IAdd(45), IAdd(56), IAdd(60), IAdd(61), IAdd(62) and IAdd(63) all add f₃₂; IAdd(3), IAdd(7), IAdd(12), IAdd(15), IAdd(18), IAdd(22), IAdd(24), IAdd(25), IAdd(27), IAdd(29), IAdd(32), IAdd(41), IAdd(43), IAdd(46), IAdd(47), IAdd(48), IAdd(49) and IAdd(58) all subtract f₃₂; IAdd(1), IAdd(7), IAdd(8), IAdd(9), IAdd(15), IAdd(17), IAdd(27), IAdd(31), IAdd(34), IAdd(36), IAdd(37), IAdd(43), IAdd(45), IAdd(46), IAdd(47), IAdd(55), IAdd(59), IAdd(61) and IAdd(63) all add f₃₃; and IAdd(6), IAdd(12), IAdd(14), IAdd(16), IAdd(21), IAdd(28), IAdd(35), IAdd(38), IAdd(39), IAdd(42), IAdd(44), IAdd(48), IAdd(49), IAdd(58) and IAdd(62) all subtract f₃₃.

During cycle 18, f₃₅ and f₃₄ are provided to the input bus. Additionally, IAdd(7), IAdd(8), IAdd(14), IAdd(16), IAdd(23), IAdd(25), IAdd(26), IAdd(29), IAdd(35), IAdd(40), IAdd(44), IAdd(46), IAdd(47), IAdd(48), IAdd(54), IAdd(61) and IAdd(63) all add f₃₄; IAdd(2), IAdd(6), IAdd(9), IAdd(13), IAdd(17), IAdd(20), IAdd(34), IAdd(36), IAdd(38), IAdd(41), IAdd(42), IAdd(43), IAdd(45), IAdd(49), IAdd(58), IAdd(59) and IAdd(62) all subtract f₃₄; IAdd(6), IAdd(10), IAdd(13), IAdd(18), IAdd(20), IAdd(21), IAdd(24), IAdd(26), IAdd(28), IAdd(31), IAdd(32), IAdd(40), IAdd(42), IAdd(43), IAdd(48), IAdd(53), IAdd(61), IAdd(62) and IAdd(63) all add f₃₅; and IAdd(0), IAdd(7), IAdd(11), IAdd(19), IAdd(22), IAdd(30), IAdd(33), IAdd(39), IAdd(44), IAdd(45), IAdd(46), IAdd(47), IAdd(49), IAdd(58) and IAdd(60) all subtract f₃₅.

During cycle 19, f₃₇ and f₃₆ are provided to the input bus. Additionally, IAdd(0), IAdd(6), IAdd(11), IAdd(18), IAdd(22), IAdd(24), IAdd(30), IAdd(32), IAdd(39), IAdd(42), IAdd(43), IAdd(48), IAdd(61), IAdd(62) and IAdd(63) all add f₃₆; IAdd(7), IAdd(10), IAdd(13), IAdd(19), IAdd(20), IAdd(21), IAdd(26), IAdd(28), IAdd(31), IAdd(33), IAdd(40), IAdd(44), IAdd(45), IAdd(46), IAdd(47), IAdd(49), IAdd(53), IAdd(58) and IAdd(60) all subtract f₃₆; IAdd(2), IAdd(7), IAdd(9), IAdd(13), IAdd(16), IAdd(20), IAdd(35), IAdd(38), IAdd(41), IAdd(44), IAdd(46), IAdd(47), IAdd(48), IAdd(61) and IAdd(63) all add f₃₇; and IAdd(6), IAdd(8), IAdd(14), IAdd(17), IAdd(23), IAdd(25), IAdd(26), IAdd(29), IAdd(34), IAdd(36), IAdd(40), IAdd(42), IAdd(43), IAdd(45), IAdd(49), IAdd(54), IAdd(58), IAdd(59) and IAdd(62) all subtract f₃₇.

During cycle 20, f₃₉ and f₃₈ are provided to the input bus. Additionally, IAdd(7), IAdd(12), IAdd(14), IAdd(17), IAdd(21), IAdd(28), IAdd(34), IAdd(36), IAdd(38), IAdd(39), IAdd(43), IAdd(45), IAdd(46), IAdd(47), IAdd(59), IAdd(61) and IAdd(63) all add f₃₈; IAdd(1), IAdd(6), IAdd(8), IAdd(9), IAdd(15), IAdd(16), IAdd(27), IAdd(31), IAdd(35), IAdd(37), IAdd(42), IAdd(44), IAdd(48), IAdd(49), IAdd(55), IAdd(58) and IAdd(62) all subtract f₃₈; IAdd(3), IAdd(6), IAdd(12), IAdd(15), IAdd(19), IAdd(22), IAdd(25), IAdd(27), IAdd(29), IAdd(33), IAdd(41), IAdd(42), IAdd(44), IAdd(45), IAdd(60), IAdd(61), IAdd(62) and IAdd(63) all add f₃₉; and IAdd(7), IAdd(10), IAdd(11), IAdd(18), IAdd(23), IAdd(24), IAdd(30), IAdd(32), IAdd(37), IAdd(43), IAdd(46), IAdd(47), IAdd(48), IAdd(49), IAdd(56) and IAdd(58) all subtract f₃₉.

During cycle 21, f₄₀ and f₄₁ are provided to the input bus. Additionally, IAdd(5), IAdd(28), IAdd(31), IAdd(34), IAdd(38), IAdd(43), IAdd(47), IAdd(56), IAdd(60), IAdd(62) and IAdd(63) all add f₄₀; IAdd(2), IAdd(4), IAdd(10), IAdd(11), IAdd(14), IAdd(15), IAdd(17), IAdd(19), IAdd(20), IAdd(21), IAdd(26), IAdd(32), IAdd(36), IAdd(37), IAdd(39), IAdd(42), IAdd(44), IAdd(45), IAdd(46), IAdd(48), IAdd(50), IAdd(57) and IAdd(61) all subtract f₄₀; IAdd(3), IAdd(4), IAdd(18), IAdd(20), IAdd(21), IAdd(25), IAdd(30), IAdd(31), IAdd(33), IAdd(35), IAdd(42), IAdd(43), IAdd(44), IAdd(46), IAdd(48), IAdd(55), IAdd(59) and IAdd(63) all add f₄₁; and IAdd(5), IAdd(8), IAdd(9), IAdd(13), IAdd(16), IAdd(22), IAdd(24), IAdd(26), IAdd(40), IAdd(41), IAdd(45), IAdd(47), IAdd(50), IAdd(57), IAdd(61) and IAdd(62) all subtract f₄₁.

During cycle 22, f₄₂ and f₄₃ are provided to the input bus. Additionally, IAdd(4), IAdd(9), IAdd(15), IAdd (16), IAdd (22), IAdd (23), IAdd (24), IAdd (27), IAdd(28), IAdd(30), IAdd(39), IAdd(42), IAdd(45), IAdd(46), IAdd(54) and IAdd(63) all add f₄₂, IAdd(0), IAdd(5), IAdd(8), IAdd(12), IAdd(18), IAdd(29), IAdd (33), IAdd (35), IAdd (37), IAdd (43), IAdd (44), IAdd(47), IAdd(48), IAdd(50), IAdd(57), IAdd(59), IAdd(61) and IAdd(62) all subtract f₄₂, IAdd(1), IAdd(5), IAdd(11), IAdd(14), IAdd(17), IAdd(19), IAdd(25), IAdd (32), IAdd (36), IAdd (38), IAdd (40), IAdd (41), IAdd (44), IAdd (45), IAdd (47), IAdd (48), IAdd (53), IAdd (62) and IAdd (63) all add f₄₃, and IAdd (4), IAdd (10), IAdd(12), IAdd(13), IAdd(23), IAdd(27), IAdd(29), IAdd (34), IAdd (42), IAdd (43), IAdd (46), IAdd (50), IAdd(57), IAdd(60) and IAdd(61) all subtract f₄₃.

During cycle 23, f₄₅ and f₄₄ are provided to the input bus. Additionally, IAdd(5), IAdd(10), IAdd(12), IAdd(13), IAdd(17), IAdd(19), IAdd(23), IAdd(27), IAdd(29), IAdd(32), IAdd(36), IAdd(44), IAdd(45), IAdd(47), IAdd(48), IAdd(62) and IAdd(63) all add f₄₄, IAdd(1), IAdd(4), IAdd(11), IAdd(14), IAdd(25), IAdd(34), IAdd(38), IAdd(40), IAdd(41), IAdd(42), IAdd(43), IAdd(46), IAdd(50), IAdd(53), IAdd(57), IAdd(60) and IAdd(61) all subtract f₄₄, IAdd(0), IAdd(4), IAdd(8), IAdd(12), IAdd(16), IAdd(24), IAdd(29), IAdd(37), IAdd(42), IAdd(45), IAdd(46) and IAdd(63) all add f₄₅, and IAdd(5), IAdd(9), IAdd(15), IAdd(18), IAdd(22), IAdd(23), IAdd(27), IAdd(28), IAdd(30), IAdd(33), IAdd(35), IAdd(39), IAdd(43), IAdd(44), IAdd(47), IAdd(48), IAdd(50), IAdd(54), IAdd(57), IAdd(59), IAdd(61) and IAdd(62) all subtract f₄₅.

During cycle 24, f₄₇ and f₄₆ are provided to the input bus. Additionally, IAdd(4), IAdd(8), IAdd(9), IAdd(13), IAdd(18), IAdd(22), IAdd(26), IAdd(33), IAdd(35), IAdd(40), IAdd(41), IAdd(42), IAdd(43), IAdd(44), IAdd(46), IAdd(48), IAdd(59) and IAdd(63) all add f₄₆, IAdd(3), IAdd(5), IAdd(16), IAdd(20), IAdd(21), IAdd(24), IAdd(25), IAdd(30), IAdd(31), IAdd(45), IAdd(47), IAdd(50), IAdd(55), IAdd(57), IAdd(61) and IAdd(62) all subtract f₄₆, IAdd(2), IAdd(5), IAdd(10), IAdd(11), IAdd(14), IAdd(15), IAdd(20), IAdd(21), IAdd(26), IAdd(34), IAdd(37), IAdd(39), IAdd(43), IAdd(47), IAdd(60), IAdd(62) and IAdd(63) all add f₄₇, and IAdd(4), IAdd(17), IAdd(19), IAdd(28), IAdd(31), IAdd(32), IAdd(36), IAdd(38), IAdd(42), IAdd(44), IAdd(45), IAdd(46), IAdd(48), IAdd(50), IAdd(56), IAdd(57) and IAdd(61) all subtract f₄₇.

During cycle 25, f₄₉ and f₄₈ are provided to the input bus. Additionally, IAdd(20), IAdd(26), IAdd(35), IAdd(36), IAdd(39), IAdd(40), IAdd(46), IAdd(48), IAdd(56), IAdd(57), IAdd(60), IAdd(62) and IAdd(63) all add f₄₈, IAdd(1), IAdd(4), IAdd(5), IAdd(10), IAdd(11), IAdd(13), IAdd(14), IAdd(16), IAdd(18), IAdd(23), IAdd(28), IAdd(29), IAdd(33), IAdd(3S), IAdd(42), IAdd(43), IAdd(44), IAdd(45), IAdd(47), IAdd(51) and IAdd(61) all subtract f₄₈, IAdd(4), IAdd(5), IAdd(19), IAdd(22), IAdd(24), IAdd(26), IAdd(29), IAdd(30), IAdd(32), IAdd(34), IAdd(37), IAdd(41), IAdd(42), IAdd(44), IAdd(47), IAdd(55), IAdd(57), IAdd(59) and IAdd(63) all add f₄₉, and IAdd(0), IAdd(8), IAdd(9), IAdd(15), IAdd(17), IAdd(20), IAdd(23), IAdd(25), IAdd(43), IAdd(45), IAdd(46), IAdd(48), IAdd(51), IAdd(61) and IAdd(62) all subtract f₄₉. Also, IAdd(49) provides σ₄₉ to Sbus(5).

During cycle 26, f₅₁ and f₅₀ are provided to the input bus. Additionally, IAdd(4), IAdd(5), IAdd(9), IAdd(12), IAdd(17), IAdd(21), IAdd(22), IAdd(31), IAdd(42), IAdd(43), IAdd(45), IAdd(47), IAdd(48), IAdd(54), IAdd(57) and IAdd(63) all add f₅₀, IAdd(3), IAdd(8), IAdd(13), IAdd(19), IAdd(24), IAdd(27), IAdd(28), IAdd(30), IAdd(32), IAdd(34), IAdd(39), IAdd(40), IAdd(44), IAdd(46), IAdd(51), IAdd(59), IAdd(61) and IAdd(62) all subtract f₅₀, IAdd(11), IAdd(12), IAdd(15), IAdd(16), IAdd(18), IAdd(21), IAdd(27), IAdd(33), IAdd(37), IAdd(38), IAdd(43), IAdd(44), IAdd(45), IAdd(46), IAdd(53), IAdd(57), IAdd(62) and IAdd(63) all add f₅₁, and IAdd(2), IAdd(4), IAdd(5), IAdd(10), IAdd(14), IAdd(25), IAdd(31), IAdd(35), IAdd(36), IAdd(41), IAdd(42), IAdd(47), IAdd(48), IAdd(51), IAdd(60) and IAdd(61) all subtract f₅₁. Also, Rmult(3) provides r₃ ·σ₄₉ to Pbus(6), Rmult(5) provides r₅ ·σ₄₉ to Pbus(5), Rmult(7) provides r₇ ·σ₄₉ to Pbus(4), and Rmult(1) provides r₁ ·σ₄₉ to Pbus(3).

During cycle 27, f₅₃ and f₅₂ are provided to the input bus. Additionally, IAdd(2), IAdd(10), IAdd(14), IAdd(16), IAdd(18), IAdd(25), IAdd(31), IAdd(33), IAdd(41), IAdd(43), IAdd(44), IAdd(45), IAdd(46), IAdd(57), IAdd(62) and IAdd(63) all add f₅₂, IAdd(4), IAdd(5), IAdd(11), IAdd(12), IAdd(15), IAdd(21), IAdd(27), IAdd(35), IAdd(36), IAdd(37), IAdd(38), IAdd(42), IAdd(47), IAdd(48), IAdd(51), IAdd(53), IAdd(60) and IAdd(61) all subtract f₅₂, IAdd(3), IAdd(4), IAdd(5), IAdd(8), IAdd(13), IAdd(17), IAdd(27), IAdd(28), IAdd(30), IAdd(39), IAdd(40), IAdd(42), IAdd(43), IAdd(45), IAdd(47), IAdd(48), IAdd(57) and IAdd(63) all add f₅₃, and IAdd(9), IAdd(12), IAdd(19), IAdd(21), IAdd(22), IAdd(24), IAdd(31), IAdd(32), IAdd(34), IAdd(44), IAdd(46), IAdd(51), IAdd(54), IAdd(59), IAdd(61) and IAdd(62) all subtract f₅₃. Also, IAdd(50) provides σ₅₀ to Sbus(5).

During cycle 28, f₅₅ and f₅₄ are provided to the input bus. Additionally, IAdd(0), IAdd(4), IAdd(5), IAdd(8), IAdd(9), IAdd(15), IAdd(19), IAdd(20), IAdd (23), IAdd (24), IAdd(25), IAdd (32), IAdd (34), IAdd (42), IAdd (44), IAdd (47), IAdd (57), IAdd (59) and IAdd(63) all add f₅₄, IAdd(17), IAdd(22), IAdd(26), IAdd (29), IAdd (30), IAdd (37), IAdd (41), IAdd (43), IAdd (45), IAdd (46), IAdd (48), IAdd (51), IAdd (55), IAdd(61) and IAdd(62) all subtract f₅₄, IAdd(1), IAdd (10), IAdd (11), IAdd (13), IAdd (14), IAdd (23), IAdd(28), IAdd(29), IAdd(35), IAdd(36), IAdd(38), IAdd(46), IAdd(48), IAdd(57), IAdd(60), IAdd(62) and IAdd(63) all add f₅₅, and IAdd(4), IAdd(5), IAdd(16), IAdd (18), IAdd (20), IAdd (26), IAdd (33), IAdd (39), IAdd(40), IAdd(42), IAdd(43), IAdd(44), IAdd(45), IAdd(47), IAdd(51), IAdd(56) and IAdd(61) all subtract f₅₅. Also, Rmult(5) provides r₅ ·σ₅₀ to Pbus(6), Rmult(1) provides r₁ ·σ₅₀ to Pbus (5), Rmult (7) provides r₇ ·σ₅₀ to Pbus (4), and Rmult (3) provides r₃ ·σ₅₀ to Pbus (3).

During cycle 29, f₅₆ and f₅₇ are provided to the input bus. Additionally, IAdd(10), IAdd(11), IAdd(12), IAdd(24), IAdd(25), IAdd(27), IAdd(41), IAdd(42), IAdd(43), IAdd(44), IAdd(45), IAdd(46), IAdd(47), IAdd(48), IAdd(56), IAdd(58), IAdd(60), IAdd(61), IAdd(62) and IAdd(63) all add f₅₆, IAdd(0), IAdd(6), IAdd(7), IAdd(13), IAdd(16), IAdd(17), IAdd(21), IAdd(22), IAdd(30), IAdd(31), IAdd(34), IAdd(35), IAdd(40) and IAdd(52) all subtract f₅₆, IAdd(2), IAdd(6), IAdd(7), IAdd(8), IAdd(9), IAdd(12), IAdd(23), IAdd(28), IAdd(29), IAdd(32), IAdd(33), IAdd(38), IAdd(39), IAdd(40), IAdd(45), IAdd(48), IAdd(55), IAdd(58), IAdd(59), IAdd(61) and IAdd(63) all add f₅₇, and IAdd(13), IAdd(14), IAdd(18), IAdd(19), IAdd(27), IAdd(36), IAdd(42), IAdd(43), IAdd(44), IAdd(46), IAdd(47), IAdd(52) and IAdd(62) all subtract f₅₇. Also, IAdd(57) provides σ₅₇ to Sbus(5), IAdd(5) provides σ₅ to Sbus(4), IAdd(4) provides σ₄ to Sbus(3), and IAdd(51) provides σ₅₁ to Sbus(2).

During cycle 30, f₅₉ and f₅₈ are provided to the input bus. Additionally, IAdd(1), IAdd(6), IAdd(7), IAdd(8), IAdd(18), IAdd(19), IAdd(20), IAdd(21), IAdd(36), IAdd(41), IAdd(43), IAdd(44), IAdd(54), IAdd(58), IAdd(61) and IAdd(63) all add f₅₈, IAdd(9), IAdd(14), IAdd(15), IAdd(25), IAdd(26), IAdd(31), IAdd(32), IAdd(33), IAdd(37), IAdd(38), IAdd(42), IAdd(45), IAdd(46), IAdd(47), IAdd(48), IAdd(52), IAdd(59) and IAdd(62) all subtract f₅₈, IAdd(3), IAdd(10), IAdd(16), IAdd(17), IAdd(22), IAdd(23), IAdd(34), IAdd(35), IAdd(37), IAdd(39), IAdd(42), IAdd(46), IAdd(47), IAdd(53), IAdd(58), IAdd(61), IAdd(62) and IAdd(63) all add f₅₉, and IAdd(6), IAdd(7), IAdd(11), IAdd(15), IAdd(20), IAdd(24), IAdd(26), IAdd(28), IAdd(29), IAdd(30), IAdd(43), IAdd(44), IAdd(45), IAdd(48), IAdd(52) and IAdd(60) all subtract f₅₉. Also, Rmult(2) provides r₂ ·σ₅₇ to Pbus(6), Rmult(3) provides r₃ ·σ₅₁ to Pbus(5), Cmult(7) provides c₇ ·σ₄ to Pbus(4), Cmult(5) provides c₅ ·σ₅ to Pbus(3), Rmult(6) provides r₆ ·σ₅₇ to Pbus(2), Cmult(3) provides c₃ ·σ₅ to Pbus(1), and Cmult(1) provides c₁ ·σ₄ to Pbus(0).

During cycle 31, f₆₁ and f₆₀ are provided to the input bus. Additionally, IAdd(11), IAdd(15), IAdd(16), IAdd(17), IAdd(20), IAdd(26), IAdd(28), IAdd(29), IAdd(30), IAdd(34), IAdd(35), IAdd(42), IAdd(46), IAdd(47), IAdd(58), IAdd(61), IAdd(62) and IAdd(63) all add f₆₀, IAdd(31), IAdd(6), IAdd(7), IAdd(10), IAdd(22), IAdd(23), IAdd(24), IAdd(37), IAdd(39), IAdd(43), IAdd(44), IAdd(45), IAdd(48), IAdd(52), IAdd(53) and IAdd(60) all subtract f₆₀, IAdd(6), IAdd(7), IAdd(9), IAdd(14), IAdd(15), IAdd(18), IAdd(19), IAdd(25), IAdd(26), IAdd(31), IAdd(36), IAdd(37), IAdd(38), IAdd(43), IAdd(44), IAdd(58), IAdd(61) and IAdd(63) all add f₆₁, and IAdd(1), IAdd(8), IAdd(20), IAdd(21), IAdd(32), IAdd(33), IAdd(41), IAdd(42), IAdd(45), IAdd(46), IAdd(47), IAdd(48), IAdd(52), IAdd(54), IAdd(59) and IAdd(62) all subtract f₆₁. Also, Cmult(1) provides c₁ ·σ₅ to Pbus(6), Cmult(5) provides c₅ ·σ₄ to Pbus(5), Rmult(5) provides r₅ ·σ₅₁ to Pbus(4), Cmult(7) provides c₇ ·σ₅ to Pbus(3), Rmult(1) provides r₁ ·σ₅₁ to Pbus(2), Cmult(3) provides c₃ ·σ₄ to Pbus(1), and Rmult(7) provides r₇ ·σ₅₁ to Pbus(0).

During cycle 32, f₆₃ and f₆₂ are provided to the input bus. Additionally, IAdd(6), IAdd(7), IAdd(13), IAdd(14), IAdd(27), IAdd(32), IAdd(33), IAdd(45), IAdd(48), IAdd(58), IAdd(59), IAdd(61) and IAdd(63) all add f₆₂, IAdd(2), IAdd(8), IAdd(9), IAdd(12), IAdd(18), IAdd(19), IAdd(23), IAdd(28), IAdd(29), IAdd(36), IAdd(38), IAdd(39), IAdd(40), IAdd(42), IAdd(43), IAdd(44), IAdd(46), IAdd(47), IAdd(52), IAdd(55) and IAdd(62) all subtract f₆₂, IAdd(0), IAdd(13), IAdd(21), IAdd(22), IAdd(24), IAdd(30), IAdd(31), IAdd(40), IAdd(42), IAdd(43), IAdd(44), IAdd(45), IAdd(46), IAdd(47), IAdd(48), IAdd(58), IAdd(60), IAdd(61), IAdd(62) and IAdd(63) all add f₆₃, and IAdd(6), IAdd(7), IAdd(10), IAdd(11), IAdd(12), IAdd(16), IAdd(17), IAdd(25), IAdd(27), IAdd(34), IAdd(35), IAdd(41), IAdd(52) and IAdd(56) all subtract f₆₃. Also, IAdd(37) provides σ₃₇ to Sbus(5), IAdd(53) provides σ₅₃ to Sbus(4), IAdd(54) provides σ₅₄ to Sbus(3), IAdd(15) provides σ₁₅ to Sbus(2), IAdd(3) provides σ₃ to Sbus(1), and IAdd(41) provides σ₄₁ to Sbus(0).

During cycle 33, IAdd(59) provides σ₅₉ to Sbus(5), IAdd(19) provides σ₁₉ to Sbus(4), IAdd(39) provides σ₃₉ to Sbus(3), IAdd(11) provides σ₁₁ to Sbus(2), IAdd(32) provides σ₃₂ to Sbus(1), and IAdd(35) provides σ₃₅ to Sbus(0). Additionally, Cmult(5) provides c₅ ·σ₄₁ to Pbus(6), Rmult(1) provides r₁ ·σ₅₃ to Pbus(5), Cmult(2) provides c₂ ·σ₃₇ to Pbus (4), Cmult (6) provides c₆ ·σ₁₅ to Pbus(3), Cmult(7) provides c₇ ·σ₄₁ to Pbus(2), Cmult(1) provides c₁ ·σ₄₁ to Pbus (1), and Rmult (3) provides r₃ ·σ₅₄ to Pbus (0).

During cycle 34, IAdd(45) provides σ₄₅ to Sbus(5), IAdd(20) provides σ₂₀ to Sbus(4), IAdd(7) provides σ₇ to Sbus(3), IAdd(13) provides σ₁₃ to Sbus(2), IAdd(28) provides σ₂₈ to Sbus(1), and IAdd(60) provides σ₆₀ to Sbus(0). Additionally, Cmult(1) provides c₁ ·σ₃₂ to Pbus(6), Rmult(7) provides r₇ ·σ₅₄ to Pbus(5), Cmult(3) provides c₃ ·σ₃₉ to Pbus(4), Cmult(7) provides c₇ ·σ₃₉ to Pbus(3), Rmult(3) provides r₃ ·σ₅₃ to Pbus(2), Cmult(5) provides c₅ ·σ₃₉ to Pbus(1), and Rmult(6) provides r₆ ·σ₅₉ to Pbus(0).

During cycle 35, IAdd(29) provides σ₂₉ to Sbus(5), IAdd (55) provides σ₅₅ to Sbus (4), IAdd (8) provides σ₈ to Sbus (3), IAdd (18) provides σ₁₈ to Sbus (2), IAdd (12) provides σ₁₂ to Sbus (1), IAdd (58) provides σ₅₈ to Sbus (0). Additionally, Cmult(1) provides c₁ ·σ₃₉ to Pbus(6), Cmult(2) provides c₂ ·σ₁₃ to Pbus(5), Rmult(2) provides r₂ ·σ₅₉ to Pbus (4), Cmult (5) provides c₅ ·σ₁₉ to Pbus (3), Cmult (3) provides c₃ ·σ₇ to Pbus (2), Rmult (7) provides r₇ ·σ₅₃ to Pbus (1), and Cmult (6) provides c₆ ·σ₄₅ to Pbus (0).

During cycle 36, IAdd(33) provides σ₃₃ to Sbus(5), IAdd(36) provides σ₃₆ to Sbus(4), IAdd(2) provides σ₂ to Sbus(3), IAdd(16) provides σ₁₆ to Sbus(2), IAdd(22) provides σ₂₂ to Sbus(1), and IAdd(27) provides σ₂₇ to Sbus(0). Additionally, Cmult(1) provides c₁ ·σ₈ to Pbus(6), Cmult(7) provides c₇ ·σ₁₈ to Pbus(5), Cmult(3) provides c₃ ·σ₂₈ to Pbus(4), Rmult(6) provides r₆ ·σ₅₈ to Pbus(3), Cmult(5) provides c₅ ·σ₃₅ to Pbus(2), Rmult(2) provides r₂ ·σ₆₀ to Pbus(1), and Rmult(3) provides r₃ ·σ₅₅ to Pbus(0).

During cycle 37, IAdd(30) provides σ₃₀ to Sbus(5), IAdd (6) provides σ₆ to Sbus (4), IAdd (9) provides σ₉ to Sbus(3), IAdd(42) provides σ₄₂ to Sbus(2), IAdd(48) provides σ₄₈ to Sbus (1), and IAdd (46) provides σ₄₆ to Sbus(0). Additionally, Cmult(1) provides c₁ ·σ₁₉ to Pbus (6), Cmult (5) provides c₅ ·σ₁₂ to Pbus (5), Cmult (3) provides c₃ ·σ₁₂ to Pbus (4), Cmult (2) provides c₂ ·σ₄₅ to Pbus (3), Cmult (6) provides c₆ ·σ₂₉ to Pbus (2), Cmult (7) provides c₇ ·σ₁₆ to Pbus (1), and Rmult (1) provides r₁ ·σ₅₄ to Pbus (0).

During cycle 38, IAdd(26) provides σ₂₆ to Sbus(5), IAdd(17) provides σ₁₇ to Sbus(4), IAdd(62) provides σ₆₂ to Sbus(3), IAdd(1) provides σ₁ to Sbus(2), IAdd(25) provides σ₂₅ to Sbus(1), and IAdd(44) provides σ₄₄ to Sbus(0). Additionally, Cmult(0) provides c₀ ·σ₂ to Pbus(6), Cmult(1) provides c₁ ·σ₁₁ to Pbus(5), Cmult(3) provides c₃ ·σ₁₁ to Pbus(4), Cmult(0) provides c₀ ·σ₃ to Pbus(3), Cmult(5) provides c₅ ·σ₃₂ to Pbus(2), Cmult(4) provides c₄ ·σ₃₀ to Pbus(1), and Cmult(7) provides c₇ ·σ₃₃ to Pbus (0).

During cycle 39, IAdd (31) provides σ₃₁ to Sbus (5), IAdd (21) provides σ₂₁ to Sbus (4), IAdd (43) provides σ₄₃ to Sbus(3), IAdd(47) provides σ₄₇ to Sbus(2), IAdd(34) provides σ₃₄ to Sbus(1), and IAdd(63) provides σ₆₃ to Sbus(0). Additionally, Cmult(7) provides c₇ ·σ₂₀ to Pbus(6), Cmult(6) provides c₆ ·σ₁₃ to Pbus(5), Cmult(3) provides c₃ ·σ₁₇ to Pbus (4), Cmult (2) provides c₂ ·σ₄₆ to Pbus(3), Rmult(2) provides r₂ ·σ₅₈ to Pbus(2), Cmult(5) provides c₅ ·σ₇ to Pbus (1), Cmult (1) provides c₁ ·σ₆ to Pbus (0).

During cycle 40, IAdd(0) provides σ₀ to Sbus(5), IAdd(38) provides σ₃₈ to Sbus(4), IAdd(56) provides σ₅₆ to Sbus(3), IAdd(23) provides σ₂₃ to Sbus(2), IAdd(24) provides σ₂₄ to Sbus(1), and IAdd(61) provides σ₆₁ to Sbus(0). Additionally, Cmult(4) provides c₄ ·σ₄₈ to Pbus(6), Rmult(6) provides r₆ ·σ₆₀ to Pbus(5), Cmult(2) provides c₂ ·σ₄₇ to Pbus(4), Cmult(1) provides c₁ ·σ₁₇ to Pbus(3), Cmult(5) provides c₅ ·σ₂₇ to Pbus(2), Cmult(3) provides c₃ ·σ₉ to Pbus(1), and Cmult(7) provides c₇ ·σ₁₉ to Pbus(0).

During cycle 41, IAdd(52) provides σ₅₂ to Sbus(3), IAdd (14) provides σ₁₄ to Sbus (2), IAdd (40) provides σ₄₀ to Sbus (1), and IAdd (10) provides σ₁₀ to Sbus (0). Additionally, Cmult(3) provides c₃ ·σ₁₆ to Pbus(6), Cmult (7) provides c₇ ·σ₃₅ to Pbus (5), Cmult (1) provides c₁ ·σ₃₅ to Pbus(4), Cmult(6) provides c₆ ·σ₄₇ to Pbus(3), Cmult(0) provides c₀ ·σ₃₆ to Pbus(2), Cmult(2) provides c₂ ·σ₂₁ to Pbus (1), Cmult (0) provides c₀ ·σ₁ to Pbus (0).

During cycle 42, Cmult(1) provides c₁ ·σ₃₃ to Pbus(6), Cmult (5) provides c₅ ·σ₈ to Pbus (5), Cmult (0) provides c₀ ·σ₂₄ to Pbus (4), Cmult (3) provides c₃ ·σ₃₃ to Pbus (3), Cmult (0) provides c₀ ·σ₀ to Pbus (2), Cmult (7) provides c₇ ·σ₂₈ to Pbus (1), and Rmult (5) provides r₅ ·σ₅₆ to Pbus (0).

During cycle 43, Cmult(1) provides c₁ ·σ₂₆ to Pbus(6), Cmult (3) provides c₃ ·σ₁₈ to Pbus (5), Cmult (5) provides c₅ ·σ₃₃ to Pbus (4), Rmult (7) provides r₇ ·σ₅₂ to Pbus (3), Rmult (3) provides r₃ ·σ₅₂ to Pbus (2), Cmult (6) provides c₆ ·σ₂₁ to Pbus (1), and Cmult (7) provides c₇ ·σ₂₆ to Pbus (0).

During cycle 44, Cmult(7) provides c₇ ·σ₉ to Pbus(6), Cmult(3) provides c₃ ·σ₆ to Pbus(5), Cmult(4) provides c₄ ·σ₁₄ to Pbus(4), Cmult(5) provides c₅ ·σ₆ to Pbus (3), Cmult (1) provides c₁ ·σ₉ to Pbus (2), Cmult (6) provides c₆ ·σ₄₄ to Pbus(1), and Cmult(2) provides c₂ ·σ₁₅ to Pbus (0).

During cycle 45, Cmult(5) provides c₅ ·σ₁₆ to Pbus(6), Rmult(4) provides r₄ ·σ₆₁ to Pbus(5), Rmult(1) provides r₁ ·σ₅₂ to Pbus(4), Cmult(7) provides c₇ ·σ₁₂ to Pbus(3), Cmult (6) provides c₆ ·σ₂₃ to Pbus (2), Cmult (1) provides c₁ ·σ₂₅ to Pbus (1), and Cmult (2) provides c₂ ·σ₄₄ to Pbus (0).

During cycle 46, Rmult (4) provides r₄ ·σ₆₂ to Pbus (6), Cmult(2) provides c₂ ·σ₄₀ to Pbus(5), Cmult(5) provides c₅ ·σ₂₀ to Pbus (4), Cmult (3) provides c₃ ·σ₂₇ to Pbus (3), Cmult (4) provides c₄ ·σ₃₈ to Pbus (2), Cmult (7) provides c₇ ·σ₇ to Pbus(1), and Cmult(6) provides c₆ ·σ₃₁ to Pbus(0).

During cycle 47, Rmult(7) provides r₇ ·σ₅₅ to Pbus(6), Cmult(7) provides c₇ ·σ₂₅ to Pbus(5), Cmult(1) provides c₁ ·σ₂₀ to Pbus(4), Cmult(2) provides c₂ ·σ₂₉ to Pbus(3), Cmult (4) provides c₄ ·σ₂₂ to Pbus (2), Cmult (6) provides c₆ ·σ₄₀ to Pbus (1), and Cmult (3) provides c₃ ·σ₈ to Pbus (0).

During cycle 48, Cmult(5) provides c₅ ·σ₉ to Pbus (6), Rmult (5) provides r₅ ·σ₅₅ to Pbus (5), Cmult (1) provides c₁ ·σ₁₂ to Pbus (4), Cmult (2) provides c₂ ·σ₃₁ to Pbus (3), Cmult (3) provides c₃ ·σ₂₅ to Pbus (2), Rmult (7) provides r₇ ·σ₅₆ to Pbus (1), and Cmult (7) provides c₇ ·σ₃₄ to Pbus (0).

During cycle 49, Rmult(1) provides r₁ ·σ₅₆ to Pbus(6), Cmult(7) provides c₇ ·σ₁₇ to Pbus(5), Cmult(1) provides c₁ ·σ₂₈ to Pbus(4), Cmult(5) provides c₅ ·σ₂₈ to Pbus(3), Cmult(0) provides c₀ ·σ₄₂ to Pbus(2), Cmult(2) provides c₂ ·σ₂₃ to Pbus(1), and Cmult(3) provides c₃ ·σ₁₉ to Pbus(0).

During cycle 50, Cmult(1) provides c₁ ·σ₇ to Pbus(6), Rmult(5) provides r₅ ·σ₅₄ to Pbus(5), Cmult(7) provides c₇ ·σ₆ to Pbus (4), Cmult (3) provides c₃ ·σ₃₄ to Pbus(3), Cmult(5) provides c₅ ·σ₃₄ to Pbus(2), Cmult(0) provides c₀ ·σ₆₃ to Pbus(1), and Cmult(6) provides c₆ ·σ₃₇ to Pbus (0).

During cycle 51, Cmult(5) provides c₅ ·σ₂₅ to Pbus(6), Cmult (4) provides c₄ ·σ₄₃ to Pbus (5), Cmult (6) provides c₆ ·σ₄₆ to Pbus (4), Cmult (1) provides c₁ ·σ₁₆ to Pbus (3), Cmult (3) provides c₃ ·σ₃₅ to Pbus (2), Rmult (1) provides r₁ ·σ₅₅ to Pbus (1), and Rmult (3) provides r₃ ·σ₅₆ to Pbus (0). Additionally, OAdd(1) and OAdd(0) provide F₁ and F₀ to the output bus.

During cycle 52, Cmult(5) provides c₅ ·σ₁₀ to Pbus(6), Cmult (1) provides c₁ ·σ₁₀ to Pbus (5), Cmult (7) provides c₇ ·σ₁₁ to Pbus (4), Rmult (5) provides r₅ ·σ₅₂ to Pbus (3), Cmult (1) provides c₁ ·σ₁₈ to Pbus (2), Cmult (5) provides c₅ ·σ₁₈ to Pbus(1), Cmult(3) provides c₃ ·σ₂₀ to Pbus(0). Additionally, OAdd(16) and OAdd(8) provide F₁₆ and F₈ to the output bus.

During cycle 53, Cmult(3) provides c₃ ·σ₁₀ to Pbus(6), Cmult(7) provides c₇ ·σ₁₀ to Pbus(5), Cmult(3) provides c₃ ·σ₂₆ to Pbus(4), Cmult(5) provides c₅ ·σ₁₁ to Pbus(3), Cmult(1) provides c₁ ·σ₃₄ to Pbus(2), Cmult(7) provides c₇ ·σ₂₇ to Pbus(1), and Rmult(5) provides r₅ ·σ₅₃ to Pbus(0). Additionally, OAdd(9) and OAdd(2) provide F₉ and F₂ to the output bus.

During cycle 54, Cmult(7) provides c₇ ·σ₈ to Pbus(3), Cmult(3) provides c₃ ·σ₄₁ to Pbus(2), Cmult(5) provides c₅ ·σ₁₇ to Pbus (1), Cmult (1) provides c₁ ·σ₂₇ to Pbus(0). Additionally, OAdd(10) and OAdd(3) provide F₁₀ and F₃ to the output bus.

During cycle 55, Cmult(5) provides c₅ ·σ₂₆ to Pbus(2), Cmult(3) provides c₃ ·σ₃₂ to Pbus(1), Cmult(7) provides c₇ ·σ₃₂ to Pbus(0). Also, OAdd(24) and OAdd(17) provide F₂₄ and F₁₇ to the output bus.

During cycle 56, OAdd(32) and OAdd(25) provide F₃₂ and F₂₅ to the output bus. During cycle 57, OAdd(18) and OAdd(11) provide F₁₈ and F₁₁ to the output bus. During cycle 58, OAdd(5) and OAdd(4) provide F₅ and F₄ to the output bus. During cycle 59, OAdd(19) and OAdd(12) provide F₁₉ and F₁₂ to the output bus. During cycle 60, OAdd (33) and OAdd (26) provide F₃₃ and F₂₆ to the output bus. During cycle 61, OAdd(48) and OAdd(40) provide F₄₈ and F₄₀ to the output bus. During cycle 62, OAdd(41) and OAdd(34) provide F₄₁ and F₃₄ to the output bus. During cycle 63, OAdd(27) and OAdd(20) provide F₂₇ and F₂₀ to the output bus. During cycle 64, OAdd (13) and OAdd (6) provide F₁₃ and F₆ to the output bus. During cycle 65, OAdd(14) and OAdd(7) provide F₁₄ and F₇ to the output bus. During cycle 66, OAdd(28) and OAdd(21) provide F₂₈ and F₂₁ to the output bus. During cycle 67, OAdd(42) and OAdd(35) provide F₄₂ and F₃₅ to the output bus. During cycle 68, OAdd(56) and OAdd(49) provide F₅₆ and F₄₉ to the output bus. During cycle 69, OAdd(57) and OAdd(50) provide F₅₇ and F₅₀ to the output bus. During cycle 70, OAdd (43) and OAdd (36) provide F₄₃ and F₃₆ to the output bus. During cycle 71, OAdd(29) and OAdd(22) provide F₂₉ and F₂₂ to the output bus. During cycle 72, OAdd(23) and OAdd(15) provide F₂₃ and F₁₅ to the output bus. During cycle 73, OAdd (37) and OAdd (30) provide F₃₇ and F₃₀ to the output bus. During cycle 74, OAdd(51) and OAdd(44) provide F₅₁ and F₄₄ to the output bus. During cycle 75, OAdd (59) and OAdd (58) provide F₅₉ and F₅₈ to the output bus. During cycle 76, OAdd(52) and OAdd(45) provide F₅₂ and F₄₅ to the output bus. During cycle 77, OAdd(38) and OAdd(31) provide F₃₈ and F₃₁ to the output bus. During cycle 78, OAdd(46) and OAdd(39) provide F₄₆ and F₃₉ to the output bus. During cycle 79, OAdd(60) and OAdd(53) provide F₆₀ and F₅₃ to the output bus. During cycle 80, OAdd(61) and OAdd(54) provide F₆₁ and F₅₄ to the output bus. 
During cycle 81, OAdd(55) and OAdd(47) provide F₅₅ and F₄₇ to the output bus. During cycle 82, OAdd(63) and OAdd(62) provide F₆₃ and F₆₂ to the output bus.
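The output sequence of cycles 51 to 82 appears to coincide with the conventional zigzag scan of an 8×8 block, taken two entries per cycle. This is an observation about the listed indices, not something the text states. The following minimal sketch regenerates the schedule under that assumption; the function names `zigzag` and `output_schedule` are hypothetical and chosen only for illustration.

```python
def zigzag(n=8):
    """Return the raster indices of an n x n block in conventional zigzag order."""
    order = []
    for s in range(2 * n - 1):  # s = row + column selects one anti-diagonal
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        if s % 2 == 0:
            rows = reversed(rows)  # even anti-diagonals are walked upward
        order.extend(r * n + (s - r) for r in rows)
    return order


def output_schedule(first_cycle=51):
    """Map each output cycle to the (higher, lower) index pair placed on the output bus."""
    order = zigzag()
    return {
        first_cycle + i // 2: (max(order[i], order[i + 1]), min(order[i], order[i + 1]))
        for i in range(0, len(order), 2)
    }


schedule = output_schedule()
print(schedule[51], schedule[55], schedule[82])
```

Each pair is written with the higher index first, matching the way the cycles are listed above (for example, OAdd(24) and OAdd(17) during cycle 55).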

DCT engine 20 uses pipelining so that more than one matrix is processed every 82 clock cycles. As many as three different matrices may be in different stages of transformation simultaneously. More specifically, because two pixels of data are received every cycle, at cycle 33 the first matrix of data is completely input and input accumulators 22 have completed the calculations based upon this data. Accordingly, the input of the second matrix of data may begin. Additionally, at cycle 65, the second matrix is completely input, the input accumulators 22 have completed the calculations based upon the second matrix of data, and the multiplication circuits have completed the calculations based upon the first matrix of data. Accordingly, the input of the third matrix may begin and the second matrix of data is provided to the multiplication circuits.
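The overlap described above can be illustrated with a small timing model. It is a sketch only: it assumes that every matrix begins input exactly 32 cycles after the previous one (cycles 1, 33, 65, ...) and that each matrix occupies the engine for 82 cycles from first input to last output. The helper names are hypothetical.

```python
INPUT_CYCLES = 32  # 64 pixels at two pixels per cycle
LATENCY = 82       # cycles from first input to last output for one matrix


def first_input_cycle(k):
    """Cycle in which matrix k (0-based) starts entering the engine."""
    return 1 + k * INPUT_CYCLES


def last_output_cycle(k):
    """Cycle in which the last pair of results for matrix k is output."""
    return first_input_cycle(k) + LATENCY - 1


def matrices_in_flight(cycle, total=10):
    """List the matrices that occupy some stage of the engine during the given cycle."""
    return [k for k in range(total)
            if first_input_cycle(k) <= cycle <= last_output_cycle(k)]


print(matrices_in_flight(65))  # first, second and third matrix overlap here
```

Under these assumptions a new matrix is accepted every 32 cycles, and no more than three matrices are in the engine at once.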

The computation of the IDCT according to the present invention is set forth as follows. As with the FDCT, notation for the matrix of data is indexed as a singly indexed vector as represented by the subscripts of X in FIG. 2. The first step in the IDCT is calculating the inverse transform coefficient vectors τ₀ to τ₆₃ as follows.

τ₀ =F₀ /2+F₉ +F₁₈ +F₂₇ +F₃₆ +F₄₅ +F₅₄ +F₆₃

τ₁ =F₀ /2-F₁₃ -F₂₂ +F₂₅ -F₃₆ -F₄₇ +F₅₀ -F₅₉

τ₂ =F₀ /2-F₉ +F₁₈ -F₂₇ +F₃₆ -F₄₅ +F₅₄ -F₆₃

τ₃ =F₀ /2+F₁₃ -F₂₂ -F₂₅ -F₃₆ +F₄₇ +F₅₀ +F₅₉

τ₄ =F₀ /2+F₁₁ +F₂₂ -F₃₁ -F₃₆ -F₄₁ -F₅₀ -F₆₁

τ₅ =F₀ /2+F₁₅ -F₁₈ -F₂₉ +F₃₆ +F₄₃ -F₅₄ -F₅₇

τ₆ =F₀ /2-F₁₁ +F₂₂ +F₃₁ -F₃₆ +F₄₁ -F₅₀ +F₆₁

τ₇ =F₀ /2-F₁₅ -F₁₈ +F₂₉ +F₃₆ -F₄₃ -F₅₄ +F₅₇

τ₈ =c₄ ·F₁ +c₄ ·F₈ +F₁₀ +F₁₇ +F₁₉ +F₂₆ +F₂₈ +F₃₅ +F₃₇ +F₄₄ +F₄₆ +F₅₃ +F₅₅ +F₆₂

τ₉ =-c₄ ·F₅ +c₄ ·F₈ -F₁₄ +F₁₇ -F₂₁ -F₂₈ -F₃₀ +F₃₃ -F₃₉ +F₄₂ -F₄₄ -F₅₁ -F₅₅ +F₅₈

τ₁₀ =-c₄ ·F₆ +F₉ -F₁₃ +c₄ ·F₁₆ -F₂₀ -F₂₉ -F₃₁ +F₃₄ -F₃₈ +F₄₁ -F₄₃ -F₅₂ +F₅₉ -F₆₃

τ₁₁ =c₄ ·F₁ -F₁₂ -F₁₄ -F₂₁ -F₂₃ +c₄ ·F₂₄ +F₂₆ -F₃₅ -F₃₇ -F₄₆ +F₄₉ +F₅₁ -F₅₈ -F₆₀

τ₁₂ =-c₄ ·F₄ +F₉ -F₁₅ +F₁₈ -F₂₂ -F₂₇ -F₂₉ +c₄ ·F₃₂ +F₄₃ -F₄₅ -F₅₀ -F₅₄ +F₅₇ +F₆₃

τ₁₃ =-c₄ ·F₇ +F₁₀ -F₁₂ +F₁₇ -F₁₉ -F₃₀ +F₃₅ -F₃₇ +c₄ ·F₄₀ -F₄₂ -F₅₃ +F₅₅ +F₆₀ -F₆₂

τ₁₄ =c₄ ·F₂ -F₁₁ -F₁₅ -F₂₀ +F₂₅ +F₂₇ -F₃₄ -F₃₈ -F₄₅ +F₄₇ +c₄ ·F₄₈ +F₅₂ -F₅₇ -F₆₁

τ₁₅ =-c₄ ·F₃ +F₁₀ +F₁₉ -F₂₃ -F₂₆ -F₂₈ +F₃₃ +F₃₉ +F₄₄ -F₄₆ -F₄₉ -F₅₃ +c₄ ·F₅₆ +F₆₂

τ₁₆ =c₄ ·F₇ +F₁₄ +F₂₁ -F₂₃ +F₂₈ -F₃₀ +F₃₅ -F₃₇ +F₄₂ -F₄₄ +F₄₉ -F₅₁ +c₄ ·F₅₆ -F₅₈

τ₁₇ =-c₄ ·F₃ +c₄ ·F₈ +F₁₄ -F₁₉ +F₂₃ -F₂₈ +F₃₀ +F₃₃ +F₃₉ -F₄₂ -F₄₄ +F₄₉ +F₅₃ -F₅₈

τ₁₈ =-c₄ ·F₆ -F₁₁ +F₁₅ +c₄ ·F₁₆ -F₂₀ +F₂₅ -F₂₇ -F₃₄ +F₃₈ +F₄₅ +F₄₇ -F₅₂ +F₅₇ -F₆₁

τ₁₉ =c₄ ·F₇ -F₁₂ +F₁₄ +F₁₇ -F₁₉ +c₄ ·F₂₄ -F₂₆ -F₃₅ +F₃₇ +F₄₆ -F₅₃ +F₅₅ +F₅₈ +F₆₀

τ₂₀ =-c₄ ·F₄ +F₉ +F₁₅ -F₁₈ +F₂₂ -F₂₇ +F₂₉ +c₄ ·F₃₂ -F₄₃ -F₄₅ +F₅₀ +F₅₄ -F₅₇ +F₆₃

τ₂₁ =c₄ ·F₁ -F₁₀ -F₁₂ +F₂₁ +F₂₃ +F₃₀ -F₃₅ -F₃₇ +c₄ ·F₄₀ +F₄₂ -F₄₉ -F₅₁ +F₆₀ +F₆₂

τ₂₂ =-c₄ ·F₂ +F₉ +F₁₃ -F₂₀ -F₂₉ +F₃₁ +F₃₄ +F₃₈ -F₄₁ -F₄₃ +c₄ ·F₄₈ +F₅₂ -F₅₉ -F₆₃

τ₂₃ =c₄ ·F₅ -F₁₀ +F₁₇ -F₂₁ +F₂₆ -F₂₈ -F₃₃ +F₃₉ +F₄₄ +F₄₆ -F₅₁ -F₅₅ +c₄ ·F₅₆ -F₆₂

τ₂₄ =c₄ ·F₆ +F₁₃ +F₁₅ +F₂₀ +F₂₇ -F₃₁ +F₃₄ -F₃₈ +F₄₁ -F₄₅ +c₄ ·F₄₈ -F₅₂ +F₅₇ -F₅₉

τ₂₅ =-c₄ ·F₇ +c₄ ·F₈ -F₁₀ +F₂₁ -F₂₃ -F₂₆ +F₂₈ -F₃₅ +F₃₇ +F₄₄ -F₄₆ +F₄₉ -F₅₁ -F₆₂

τ₂₆ =-c₄ ·F₂ +F₁₃ -F₁₅ +c₄ ·F₁₆ +F₂₀ -F₂₇ -F₃₁ -F₃₄ -F₃₈ +F₄₁ +F₄₅ +F₅₂ -F₅₇ -F₅₉

τ₂₇ =c₄ ·F₅ -F₁₀ +F₁₂ -F₁₉ -F₂₃ +c₄ ·F₂₄ -F₃₀ +F₃₃ -F₃₉ -F₄₂ -F₄₉ +F₅₃ +F₆₀ +F₆₂

τ₂₈ =c₄ ·F₄ -F₁₁ +F₁₃ -F₁₈ -F₂₂ +F₂₅ -F₃₁ +c₄ ·F₃₂ -F₄₁ -F₄₇ -F₅₀ +F₅₄ +F₅₉ +F₆₁

τ₂₉ =-c₄ ·F₃ +F₁₂ -F₁₄ +F₁₇ +F₂₁ -F₂₆ -F₃₃ -F₃₉ +c₄ ·F₄₀ +F₄₆ +F₅₁ -F₅₅ -F₅₈ -F₆₀

τ₃₀ =-c₄ ·F₆ +F₉ -F₁₁ +F₂₀ -F₂₅ +F₂₉ -F₃₄ +F₃₈ +F₄₃ -F₄₇ +c₄ ·F₄₈ -F₅₂ -F₆₁ -F₆₃

τ₃₁ =c₄ ·F₁ -F₁₄ -F₁₇ -F₁₉ +F₂₈ +F₃₀ +F₃₅ +F₃₇ -F₄₂ -F₄₄ -F₅₃ -F₅₅ +c₄ ·F₅₆ +F₅₈

τ₃₂ =c₄ ·F₅ +F₁₂ +F₁₄ +F₁₉ +F₂₃ +F₂₆ +F₃₃ -F₃₉ +c₄ ·F₄₀ -F₄₆ +F₄₉ -F₅₃ +F₅₈ -F₆₀

τ₃₃ =-c₄ ·F₇ +c₄ ·F₈ -F₁₀ -F₂₁ +F₂₃ -F₂₆ +F₂₈ +F₃₅ -F₃₇ +F₄₄ -F₄₆ -F₄₉ +F₅₁ -F₆₂

τ₃₄ =-c₄ ·F₂ -F₁₃ +F₁₅ +c₄ ·F₁₆ +F₂₀ +F₂₇ +F₃₁ -F₃₄ -F₃₈ -F₄₁ -F₄₅ +F₅₂ +F₅₇ +F₅₉

τ₃₅ =-c₄ ·F₅ -F₁₀ +F₁₂ +F₁₉ +F₂₃ +c₄ ·F₂₄ -F₃₀ -F₃₃ +F₃₉ -F₄₂ +F₄₉ -F₅₃ +F₆₀ +F₆₂

τ₃₆ =c₄ ·F₄ +F₁₁ -F₁₃ -F₁₈ -F₂₂ -F₂₅ +F₃₁ +c₄ ·F₃₂ +F₄₁ +F₄₇ -F₅₀ +F₅₄ -F₅₉ -F₆₁

τ₃₇ =c₄ ·F₃ +F₁₂ -F₁₄ -F₁₇ -F₂₁ -F₂₆ +F₃₃ +F₃₉ +c₄ ·F₄₀ +F₄₆ -F₅₁ +F₅₅ -F₅₈ -F₆₀

τ₃₈ =-c₄ ·F₆ -F₉ +F₁₁ +F₂₀ +F₂₅ -F₂₉ -F₃₄ +F₃₈ -F₄₃ +F₄₇ +c₄ ·F₄₈ -F₅₂ +F₆₁ +F₆₃

τ₃₉ =-c₄ ·F₁ -F₁₄ +F₁₇ +F₁₉ +F₂₈ +F₃₀ -F₃₅ -F₃₇ -F₄₂ -F₄₄ +F₅₃ +F₅₅ +c₄ ·F₅₆ +F₅₈

τ₄₀ =c₄ ·F₄ +F₁₁ +F₁₃ +F₁₈ +F₂₂ +F₂₅ +F₃₁ +c₄ ·F₃₂ +F₄₁ -F₄₇ +F₅₀ -F₅₄ +F₅₉ -F₆₁

τ₄₁ =c₄ ·F₃ +c₄ ·F₈ +F₁₄ +F₁₉ -F₂₃ -F₂₈ +F₃₀ -F₃₃ -F₃₉ -F₄₂ -F₄₄ -F₄₉ -F₅₃ -F₅₈

τ₄₂ =c₄ ·F₆ +F₁₁ -F₁₅ +c₄ ·F₁₆ -F₂₀ -F₂₅ +F₂₇ -F₃₄ +F₃₈ -F₄₅ -F₄₇ -F₅₂ -F₅₇ +F₆₁

τ₄₃ =-c₄ ·F₇ -F₁₂ +F₁₄ -F₁₇ +F₁₉ +c₄ ·F₂₄ -F₂₆ +F₃₅ -F₃₇ +F₄₆ +F₅₃ -F₅₅ +F₅₈ -F₆₀

τ₄₄ =-c₄ ·F₄ -F₉ -F₁₅ -F₁₈ +F₂₂ +F₂₇ -F₂₉ +c₄ ·F₃₂ +F₄₃ +F₄₅ +F₅₀ +F₅₄ +F₅₇ +F₆₃

τ₄₅ =-c₄ ·F₁ -F₁₀ -F₁₂ -F₂₁ -F₂₃ +F₃₀ +F₃₅ +F₃₇ +c₄ ·F₄₀ +F₄₂ +F₄₉ +F₅₁ +F₆₀ +F₆₂

τ₄₆ =-c₄ ·F₂ -F₉ -F₁₃ -F₂₀ +F₂₉ -F₃₁ +F₃₄ +F₃₈ +F₄₁ +F₄₃ +c₄ ·F₄₈ +F₅₂ +F₅₉ +F₆₃

τ₄₇ =-c₄ ·F₅ -F₁₀ -F₁₇ +F₂₁ +F₂₆ -F₂₈ +F₃₃ -F₃₉ +F₄₄ +F₄₆ +F₅₁ +F₅₅ +c₄ ·F₅₆ -F₆₂

τ₄₈ =c₄ ·F₃ +F₁₀ +F₁₂ +F₁₇ +F₂₁ +c₄ ·F₂₄ +F₃₀ +F₃₃ +F₃₉ +F₄₂ +F₅₁ -F₅₅ +F₆₀ -F₆₂

τ₄₉ =c₄ ·F₅ +c₄ ·F₈ -F₁₄ -F₁₇ +F₂₁ -F₂₈ -F₃₀ -F₃₃ +F₃₉ +F₄₂ -F₄₄ +F₅₁ +F₅₅ +F₅₈

τ₅₀ =-c₄ ·F₆ -F₉ +F₁₃ +c₄ ·F₁₆ -F₂₀ +F₂₉ +F₃₁ +F₃₄ -F₃₈ -F₄₁ +F₄₃ -F₅₂ -F₅₉ +F₆₃

τ₅₁ =-c₄ ·F₁ -F₁₂ -F₁₄ +F₂₁ +F₂₃ +c₄ ·F₂₄ +F₂₆ +F₃₅ +F₃₇ -F₄₆ -F₄₉ -F₅₁ -F₅₈ -F₆₀

τ₅₂ =-c₄ ·F₄ -F₉ +F₁₅ +F₁₈ -F₂₂ +F₂₇ +F₂₉ +c₄ ·F₃₂ -F₄₃ +F₄₅ -F₅₀ -F₅₄ -F₅₇ -F₆₃

τ₅₃ =c₄ ·F₇ +F₁₀ -F₁₂ -F₁₇ +F₁₉ -F₃₀ -F₃₅ +F₃₇ +c₄ ·F₄₀ -F₄₂ +F₅₃ -F₅₅ +F₆₀ -F₆₂

τ₅₄ =c₄ ·F₂ +F₁₁ +F₁₅ -F₂₀ -F₂₅ -F₂₇ -F₃₄ -F₃₈ +F₄₅ -F₄₇ +c₄ ·F₄₈ +F₅₂ +F₅₇ +F₆₁

τ₅₅ =c₄ ·F₃ +F₁₀ -F₁₉ +F₂₃ -F₂₆ -F₂₈ -F₃₃ -F₃₉ +F₄₄ -F₄₆ +F₄₉ +F₅₃ +c₄ ·F₅₆ +F₆₂

τ₅₆ =c₄ ·F₂ +F₉ +F₁₁ +c₄ ·F₁₆ +F₂₀ +F₂₅ +F₂₉ +F₃₄ +F₃₈ +F₄₃ +F₄₇ +F₅₂ +F₆₁ -F₆₃

τ₅₇ =-c₄ ·F₁ +c₄ ·F₈ +F₁₀ -F₁₇ -F₁₉ +F₂₆ +F₂₈ -F₃₅ -F₃₇ +F₄₄ +F₄₆ -F₅₃ -F₅₅ +F₆₂

τ₅₈ =c₄ ·F₂ -F₉ -F₁₁ +c₄ ·F₁₆ +F₂₀ -F₂₅ -F₂₉ +F₃₄ +F₃₈ -F₄₃ -F₄₇ +F₅₂ -F₆₁ +F₆₃

τ₅₉ =-c₄ ·F₃ +F₁₀ +F₁₂ -F₁₇ -F₂₁ +c₄ ·F₂₄ +F₃₀ -F₃₃ -F₃₉ +F₄₂ -F₅₁ +F₅₅ +F₆₀ -F₆₂

τ₆₀ =c₄ ·F₄ -F₁₁ -F₁₃ +F₁₈ +F₂₂ -F₂₅ -F₃₁ +c₄ ·F₃₂ -F₄₁ +F₄₇ +F₅₀ -F₅₄ -F₅₉ +F₆₁

τ₆₁ =c₄ ·F₆ -F₁₃ -F₁₅ +F₂₀ -F₂₇ +F₃₁ +F₃₄ -F₃₈ -F₄₁ +F₄₅ +c₄ ·F₄₈ -F₅₂ -F₅₇ +F₅₉

τ₆₂ =-c₄ ·F₅ +F₁₂ +F₁₄ -F₁₉ -F₂₃ +F₂₆ -F₃₃ +F₃₉ +c₄ ·F₄₀ -F₄₆ -F₄₉ +F₅₃ +F₅₈ -F₆₀

τ₆₃ =-c₄ ·F₇ +F₁₄ -F₂₁ +F₂₃ +F₂₈ -F₃₀ -F₃₅ +F₃₇ +F₄₂ -F₄₄ -F₄₉ +F₅₁ +c₄ ·F₅₆ -F₅₈
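The signs in the τ equations can be cross-checked with the product-to-sum identity cos a · cos b = ½[cos(a+b) + cos(a−b)]. The sketch below is illustrative only. It assumes that c_k denotes cos(kπ/16), that pixels and coefficients are indexed in raster order (index = 8·row + column), and it ignores all normalization factors, including the c₄ scaling of the first row and column. The names `fold` and `constants_for` are hypothetical.

```python
import math


def fold(k):
    """Reduce cos(k*pi/16) to (sign, j), 0 <= j <= 8, meaning sign * cos(j*pi/16)."""
    k %= 32
    if k > 16:
        k = 32 - k           # cos(t) == cos(2*pi - t)
    if k > 8:
        return (-1, 16 - k)  # cos(t) == -cos(pi - t)
    if k == 8:
        return (0, 8)        # cos(pi/2) is zero
    return (1, k)


def constants_for(pixel, coeff):
    """Return the two signed constants that coefficient `coeff` contributes to `pixel`."""
    a = (2 * (pixel // 8) + 1) * (coeff // 8)  # row angle, in units of pi/16
    b = (2 * (pixel % 8) + 1) * (coeff % 8)    # column angle, in units of pi/16
    return [fold(a + b), fold(a - b)]


# F9 seen from pixel 1: +c4 and +c2, matching +F9 in the tau_12 and tau_10 equations.
print(constants_for(1, 9))
# F45 seen from pixel 1: -c4 and -c6, matching -F45 in the tau_12 and tau_14 equations.
print(constants_for(1, 45))
```

For coefficients in the first row or first column one of the two angles is zero, so both returned entries coincide and the term maps to a single constant.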

After the inverse transform coefficient vectors are generated, these coefficient vectors are used to generate the IDCT matrix as follows.

f₀ =c₀ ·τ₀ +c₁ ·τ₈ +c₂ ·τ₅₆ +c₃ ·τ₄₈ +c₄ ·τ₄₀ +c₅ ·τ₃₂ +c₆ ·τ₂₄ +c₇ ·τ₁₆

f₁ =c₀ ·τ₁ +c₁ ·τ₉ +c₂ ·τ₁₀ +c₃ ·τ₁₁ +c₄ ·τ₁₂ +c₅ ·τ₁₃ +c₆ ·τ₁₄ +c₇ ·τ₁₅

f₂ =c₀ ·τ₆ +c₁ ·τ₁₇ +c₂ ·τ₁₈ +c₃ ·τ₁₉ +c₄ ·τ₂₀ +c₅ ·τ₂₁ +c₆ ·τ₂₂ +c₇ ·τ₂₃

f₃ =c₀ ·τ₇ +c₁ ·τ₂₅ +c₂ ·τ₂₆ +c₃ ·τ₂₇ +c₄ ·τ₂₈ +c₅ ·τ₂₉ +c₆ ·τ₃₀ +c₇ ·τ₃₁

f₄ =c₀ ·τ₅ +c₁ ·τ₃₃ +c₂ ·τ₃₄ +c₃ ·τ₃₅ +c₄ ·τ₃₆ +c₅ ·τ₃₇ +c₆ ·τ₃₈ +c₇ ·τ₃₉

f₅ =c₀ ·τ₄ +c₁ ·τ₄₁ +c₂ ·τ₄₂ +c₃ ·τ₄₃ +c₄ ·τ₄₄ +c₅ ·τ₄₅ +c₆ ·τ₄₆ +c₇ ·τ₄₇

f₆ =c₀ ·τ₃ +c₁ ·τ₄₉ +c₂ ·τ₅₀ +c₃ ·τ₅₁ +c₄ ·τ₅₂ +c₅ ·τ₅₃ +c₆ ·τ₅₄ +c₇ ·τ₅₅

f₇ =c₀ ·τ₂ +c₁ ·τ₅₇ +c₂ ·τ₅₈ +c₃ ·τ₅₉ +c₄ ·τ₆₀ +c₅ ·τ₆₂ +c₆ ·τ₆₁ +c₇ ·τ₆₃

f₈ =c₀ ·τ₄ -c₁ ·τ₄₅ -c₂ ·τ₄₆ +c₃ ·τ₄₁ -c₄ ·τ₄₄ -c₅ ·τ₄₇ +c₆ ·τ₄₂ -c₇ ·τ₄₃

f₉ =c₀ ·τ₀ -c₁ ·τ₃₂ -c₂ ·τ₂₄ +c₃ ·τ₈ -c₄ ·τ₄₀ -c₅ ·τ₁₆ +c₆ ·τ₅₆ -c₇ ·τ₄₈

f₁₀ =c₀ ·τ₅ -c₁ ·τ₃₇ -c₂ ·τ₃₈ +c₃ ·τ₃₃ -c₄ ·τ₃₆ -c₅ ·τ₃₉ +c₆ ·τ₃₄ -c₇ ·τ₃₅

f₁₁ =c₀ ·τ₃ -c₁ ·τ₅₃ -c₂ ·τ₅₄ +c₃ ·τ₄₉ -c₄ ·τ₅₂ -c₅ ·τ₅₅ +c₆ ·τ₅₀ -c₇ ·τ₅₁

f₁₂ =c₀ ·τ₁ -c₁ ·τ₁₃ -c₂ ·τ₁₄ +c₃ ·τ₉ -c₄ ·τ₁₂ -c₅ ·τ₁₅ +c₆ ·τ₁₀ -c₇ ·τ₁₁

f₁₃ =c₀ ·τ₇ -c₁ ·τ₂₉ -c₂ ·τ₃₀ +c₃ ·τ₂₅ -c₄ ·τ₂₈ -c₅ ·τ₃₁ +c₆ ·τ₂₆ -c₇ ·τ₂₇

f₁₄ =c₀ ·τ₂ -c₁ ·τ₆₂ -c₂ ·τ₆₁ +c₃ ·τ₅₇ -c₄ ·τ₆₀ -c₅ ·τ₆₃ +c₆ ·τ₅₈ -c₇ ·τ₅₉

f₁₅ =c₀ ·τ₆ -c₁ ·τ₂₁ -c₂ ·τ₂₂ +c₃ ·τ₁₇ -c₄ ·τ₂₀ -c₅ ·τ₂₃ +c₆ ·τ₁₈ -c₇ ·τ₁₉

f₁₆ =c₀ ·τ₃ -c₁ ·τ₅₁ +c₂ ·τ₅₄ +c₃ ·τ₅₅ -c₄ ·τ₅₂ +c₅ ·τ₄₉ -c₆ ·τ₅₀ +c₇ ·τ₅₃

f₁₇ =c₀ ·τ₇ -c₁ ·τ₂₇ +c₂ ·τ₃₀ +c₃ ·τ₃₁ -c₄ ·τ₂₈ +c₅ ·τ₂₅ -c₆ ·τ₂₆ +c₇ ·τ₂₉

f₁₈ =c₀ ·τ₀ -c₁ ·τ₄₈ +c₂ ·τ₂₄ +c₃ ·τ₁₆ -c₄ ·τ₄₀ +c₅ ·τ₈ -c₆ ·τ₅₆ +c₇ ·τ₃₂

f₁₉ =c₀ ·τ₆ -c₁ ·τ₁₉ +c₂ ·τ₂₂ +c₃ ·τ₂₃ -c₄ ·τ₂₀ +c₅ ·τ₁₇ -c₆ ·τ₁₈ +c₇ ·τ₂₁

f₂₀ =c₀ ·τ₄ -c₁ ·τ₄₃ +c₂ ·τ₄₆ +c₃ ·τ₄₇ -c₄ ·τ₄₄ +c₅ ·τ₄₁ -c₆ ·τ₄₂ +c₇ ·τ₄₅

f₂₁ =c₀ ·τ₂ -c₁ ·τ₅₉ +c₂ ·τ₆₁ +c₃ ·τ₆₃ -c₄ ·τ₆₀ +c₅ ·τ₅₇ -c₆ ·τ₅₈ +c₇ ·τ₆₂

f₂₂ =c₀ ·τ₅ -c₁ ·τ₃₅ +c₂ ·τ₃₈ +c₃ ·τ₃₉ -c₄ ·τ₃₆ +c₅ ·τ₃₃ -c₆ ·τ₃₄ +c₇ ·τ₃₇

f₂₃ =c₀ ·τ₁ -c₁ ·τ₁₁ +c₂ ·τ₁₄ +c₃ ·τ₁₅ -c₄ ·τ₁₂ +c₅ ·τ₉ -c₆ ·τ₁₀ +c₇ ·τ₁₃

f₂₄ =c₀ ·τ₅ -c₁ ·τ₃₉ -c₂ ·τ₃₄ +c₃ ·τ₃₇ +c₄ ·τ₃₆ -c₅ ·τ₃₅ -c₆ ·τ₃₈ +c₇ ·τ₃₃

f₂₅ =c₀ ·τ₆ -c₁ ·τ₂₃ -c₂ ·τ₁₈ +c₃ ·τ₂₁ +c₄ ·τ₂₀ -c₅ ·τ₁₉ -c₆ ·τ₂₂ +c₇ ·τ₁₇

f₂₆ =c₀ ·τ₃ -c₁ ·τ₅₅ -c₂ ·τ₅₀ +c₃ ·τ₅₃ +c₄ ·τ₅₂ -c₅ ·τ₅₁ -c₆ ·τ₅₄ +c₇ ·τ₄₉

f₂₇ =c₀ ·τ₀ -c₁ ·τ₁₆ -c₂ ·τ₅₆ +c₃ ·τ₃₂ +c₄ ·τ₄₀ -c₅ ·τ₄₈ -c₆ ·τ₂₄ +c₇ ·τ₈

f₂₈ =c₀ ·τ₂ -c₁ ·τ₆₃ -c₂ ·τ₅₈ +c₃ ·τ₆₂ +c₄ ·τ₆₀ -c₅ ·τ₅₉ -c₆ ·τ₆₁ +c₇ ·τ₅₇

f₂₉ =c₀ ·τ₁ -c₁ ·τ₁₅ -c₂ ·τ₁₀ +c₃ ·τ₁₃ +c₄ ·τ₁₂ -c₅ ·τ₁₁ -c₆ ·τ₁₄ +c₇ ·τ₉

f₃₀ =c₀ ·τ₄ -c₁ ·τ₄₇ -c₂ ·τ₄₂ +c₃ ·τ₄₅ +c₄ ·τ₄₄ -c₅ ·τ₄₃ -c₆ ·τ₄₆ +c₇ ·τ₄₁

f₃₁ =c₀ ·τ₇ -c₁ ·τ₃₁ -c₂ ·τ₂₆ +c₃ ·τ₂₉ +c₄ ·τ₂₈ -c₅ ·τ₂₇ -c₆ ·τ₃₀ +c₇ ·τ₂₅

f₃₂ =c₀ ·τ₇ +c₁ ·τ₃₁ -c₂ ·τ₂₆ +c₃ ·τ₂₉ +c₄ ·τ₂₈ +c₅ ·τ₂₇ -c₆ ·τ₃₀ -c₇ ·τ₂₅

f₃₃ =c₀ ·τ₄ +c₁ ·τ₄₇ -c₂ ·τ₄₂ +c₃ ·τ₄₅ +c₄ ·τ₄₄ +c₅ ·τ₄₃ -c₆ ·τ₄₆ -c₇ ·τ₄₁

f₃₄ =c₀ ·τ₁ +c₁ ·τ₁₅ -c₂ ·τ₁₀ +c₃ ·τ₁₃ +c₄ ·τ₁₂ +c₅ ·τ₁₁ -c₆ ·τ₁₄ -c₇ ·τ₉

f₃₅ =c₀ ·τ₂ +c₁ ·τ₆₃ -c₂ ·τ₅₈ +c₃ ·τ₆₂ +c₄ ·τ₆₀ +c₅ ·τ₅₉ -c₆ ·τ₆₁ -c₇ ·τ₅₇

f₃₆ =c₀ ·τ₀ +c₁ ·τ₁₆ -c₂ ·τ₅₆ +c₃ ·τ₃₂ +c₄ ·τ₄₀ +c₅ ·τ₄₈ -c₆ ·τ₂₄ -c₇ ·τ₈

f₃₇ =c₀ ·τ₃ +c₁ ·τ₅₅ -c₂ ·τ₅₀ +c₃ ·τ₅₃ +c₄ ·τ₅₂ +c₅ ·τ₅₁ -c₆ ·τ₅₄ -c₇ ·τ₄₉

f₃₈ =c₀ ·τ₆ +c₁ ·τ₂₃ -c₂ ·τ₁₈ +c₃ ·τ₂₁ +c₄ ·τ₂₀ +c₅ ·τ₁₉ -c₆ ·τ₂₂ -c₇ ·τ₁₇

f₃₉ =c₀ ·τ₅ +c₁ ·τ₃₉ -c₂ ·τ₃₄ +c₃ ·τ₃₇ +c₄ ·τ₃₆ +c₅ ·τ₃₅ -c₆ ·τ₃₈ -c₇ ·τ₃₃

f₄₀ =c₀ ·τ₁ +c₁ ·τ₁₁ +c₂ ·τ₁₄ +c₃ ·τ₁₅ -c₄ ·τ₁₂ -c₅ ·τ₉ -c₆ ·τ₁₀ -c₇ ·τ₁₃

f₄₁ =c₀ ·τ₅ +c₁ ·τ₃₅ +c₂ ·τ₃₈ +c₃ ·τ₃₉ -c₄ ·τ₃₆ -c₅ ·τ₃₃ -c₆ ·τ₃₄ -c₇ ·τ₃₇

f₄₂ =c₀ ·τ₂ +c₁ ·τ₅₉ +c₂ ·τ₆₁ +c₃ ·τ₆₃ -c₄ ·τ₆₀ -c₅ ·τ₅₇ -c₆ ·τ₅₈ -c₇ ·τ₆₂

f₄₃ =c₀ ·τ₄ +c₁ ·τ₄₃ +c₂ ·τ₄₆ +c₃ ·τ₄₇ -c₄ ·τ₄₄ -c₅ ·τ₄₁ -c₆ ·τ₄₂ -c₇ ·τ₄₅

f₄₄ =c₀ ·τ₆ +c₁ ·τ₁₉ +c₂ ·τ₂₂ +c₃ ·τ₂₃ -c₄ ·τ₂₀ -c₅ ·τ₁₇ -c₆ ·τ₁₈ -c₇ ·τ₂₁

f₄₅ =c₀ ·τ₀ +c₁ ·τ₄₈ +c₂ ·τ₂₄ +c₃ ·τ₁₆ -c₄ ·τ₄₀ -c₅ ·τ₈ -c₆ ·τ₅₆ -c₇ ·τ₃₂

f₄₆ =c₀ ·τ₇ +c₁ ·τ₂₇ +c₂ ·τ₃₀ +c₃ ·τ₃₁ -c₄ ·τ₂₈ -c₅ ·τ₂₅ -c₆ ·τ₂₆ -c₇ ·τ₂₉

f₄₇ =c₀ ·τ₃ +c₁ ·τ₅₁ +c₂ ·τ₅₄ +c₃ ·τ₅₅ -c₄ ·τ₅₂ -c₅ ·τ₄₉ -c₆ ·τ₅₀ -c₇ ·τ₅₃

f₄₈ =c₀ ·τ₆ +c₁ ·τ₂₁ -c₂ ·τ₂₂ +c₃ ·τ₁₇ -c₄ ·τ₂₀ +c₅ ·τ₂₃ +c₆ ·τ₁₈ +c₇ ·τ₁₉

f₄₉ =c₀ ·τ₂ +c₁ ·τ₆₂ -c₂ ·τ₆₁ +c₃ ·τ₅₇ -c₄ ·τ₆₀ +c₅ ·τ₆₃ +c₆ ·τ₅₈ +c₇ ·τ₅₉

f₅₀ =c₀ ·τ₇ +c₁ ·τ₂₉ -c₂ ·τ₃₀ +c₃ ·τ₂₅ -c₄ ·τ₂₈ +c₅ ·τ₃₁ +c₆ ·τ₂₆ +c₇ ·τ₂₇

f₅₁ =c₀ ·τ₁ +c₁ ·τ₁₃ -c₂ ·τ₁₄ +c₃ ·τ₉ -c₄ ·τ₁₂ +c₅ ·τ₁₅ +c₆ ·τ₁₀ +c₇ ·τ₁₁

f₅₂ =c₀ ·τ₃ +c₁ ·τ₅₃ -c₂ ·τ₅₄ +c₃ ·τ₄₉ -c₄ ·τ₅₂ +c₅ ·τ₅₅ +c₆ ·τ₅₀ +c₇ ·τ₅₁

f₅₃ =c₀ ·τ₅ +c₁ ·τ₃₇ -c₂ ·τ₃₈ +c₃ ·τ₃₃ -c₄ ·τ₃₆ +c₅ ·τ₃₉ +c₆ ·τ₃₄ +c₇ ·τ₃₅

f₅₄ =c₀ ·τ₀ +c₁ ·τ₃₂ -c₂ ·τ₂₄ +c₃ ·τ₈ -c₄ ·τ₄₀ +c₅ ·τ₁₆ +c₆ ·τ₅₆ +c₇ ·τ₄₈

f₅₅ =c₀ ·τ₄ +c₁ ·τ₄₅ -c₂ ·τ₄₆ +c₃ ·τ₄₁ -c₄ ·τ₄₄ +c₅ ·τ₄₇ +c₆ ·τ₄₂ +c₇ ·τ₄₃

f₅₆ =c₀ ·τ₂ -c₁ ·τ₅₇ +c₂ ·τ₅₈ +c₃ ·τ₅₉ +c₄ ·τ₆₀ -c₅ ·τ₆₂ +c₆ ·τ₆₁ -c₇ ·τ₆₃

f₅₇ =c₀ ·τ₃ -c₁ ·τ₄₉ +c₂ ·τ₅₀ +c₃ ·τ₅₁ +c₄ ·τ₅₂ -c₅ ·τ₅₃ +c₆ ·τ₅₄ -c₇ ·τ₅₅

f₅₈ =c₀ ·τ₄ -c₁ ·τ₄₁ +c₂ ·τ₄₂ +c₃ ·τ₄₃ +c₄ ·τ₄₄ -c₅ ·τ₄₅ +c₆ ·τ₄₆ -c₇ ·τ₄₇

f₅₉ =c₀ ·τ₅ -c₁ ·τ₃₃ +c₂ ·τ₃₄ +c₃ ·τ₃₅ +c₄ ·τ₃₆ -c₅ ·τ₃₇ +c₆ ·τ₃₈ -c₇ ·τ₃₉

f₆₀ =c₀ ·τ₇ -c₁ ·τ₂₅ +c₂ ·τ₂₆ +c₃ ·τ₂₇ +c₄ ·τ₂₈ -c₅ ·τ₂₉ +c₆ ·τ₃₀ -c₇ ·τ₃₁

f₆₁ =c₀ ·τ₆ -c₁ ·τ₁₇ +c₂ ·τ₁₈ +c₃ ·τ₁₉ +c₄ ·τ₂₀ -c₅ ·τ₂₁ +c₆ ·τ₂₂ -c₇ ·τ₂₃

f₆₂ =c₀ ·τ₁ -c₁ ·τ₉ +c₂ ·τ₁₀ +c₃ ·τ₁₁ +c₄ ·τ₁₂ -c₅ ·τ₁₃ +c₆ ·τ₁₄ -c₇ ·τ₁₅

f₆₃ =c₀ ·τ₀ -c₁ ·τ₈ +c₂ ·τ₅₆ +c₃ ·τ₄₈ +c₄ ·τ₄₀ -c₅ ·τ₃₂ +c₆ ·τ₂₄ -c₇ ·τ₁₆
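
Each of the equations above is a signed sum of eight transform products, which is the quantity an output accumulator builds up. The following Python sketch illustrates this for f₆₃ only. The names `F63_TERMS` and `accumulate_output` are hypothetical, and the values of the constants c₀ to c₇ are left to the caller because they are not restated in this passage.

```python
# Terms of f63 exactly as listed in the equation above:
# (sign, index k of constant c_k, index m of coefficient tau_m)
F63_TERMS = [
    (+1, 0, 0),
    (-1, 1, 8),
    (+1, 2, 56),
    (+1, 3, 48),
    (+1, 4, 40),
    (-1, 5, 32),
    (+1, 6, 24),
    (-1, 7, 16),
]


def accumulate_output(terms, c, tau):
    """Keep a running total of signed products c[k] * tau[m].

    `c` is a sequence of the eight transform constants and `tau` a
    sequence of the 64 input-accumulator results (both supplied by
    the caller; this sketch does not define their values).
    """
    total = 0.0
    for sign, k, m in terms:
        total += sign * c[k] * tau[m]
    return total
```

The other 63 outputs would use the same routine with their own term lists.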

More specifically, DCT engine 20 uses input accumulators 22, multipliers 24 and output accumulators 26 to perform the many calculations of an IDCT in parallel. These calculations are divided into discrete calculations which are performed during different cycles. It takes a total of 86 cycles to completely transform an 8×8 matrix from the time the first location of input data is input until the last location of transformed data is output by DCT engine 20. Individual input accumulators 22 and output accumulators 26 maintain running totals as each new entry is accumulated during a cycle.

For the IDCT, the engine includes a constant multiplier at the input bus which multiplies F₁, F₂, F₃, F₄, F₅, F₆, F₇, F₈, F₁₆, F₂₄, F₃₂, F₄₀, F₄₈, and F₅₆ by c₄ and which divides F₀ by 2. Additionally, Rmult(0) to Rmult(7) multiply by c₀ to c₇ rather than by r₀ to r₇; i.e., Rmult(i) is a duplicate of Cmult(i). In operation, Rmult(0), Rmult(2), Rmult(4) and Rmult(6) are not used during the IDCT. Accordingly, if the DCT engine only performs IDCTs, then Rmult(0), Rmult(2), Rmult(4) and Rmult(6) can be omitted from the engine. The individual cycles of the IDCT will now be discussed in detail.
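
The input scaling described above can be sketched in Python as follows. This is an illustration only: the function name `prescale_inputs` is hypothetical, the input matrix is assumed to be held as a singly indexed list of 64 values, and the value of c₄ is passed in by the caller rather than assumed here.

```python
def prescale_inputs(F, c4):
    """Return a copy of the 64 inputs with the scaling applied at the input bus.

    F[0] is divided by 2; F[1]..F[7] and F[8], F[16], ..., F[56] are
    multiplied by c4; every other entry passes through unchanged.
    """
    if len(F) != 64:
        raise ValueError("expected 64 input values")
    scaled = list(F)
    scaled[0] = F[0] / 2
    scaled_by_c4 = list(range(1, 8)) + list(range(8, 64, 8))
    for i in scaled_by_c4:
        scaled[i] = c4 * F[i]
    return scaled
```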

During cycle 1, F₁ and F₀ are provided to the input bus and F₁ is multiplied by c₄. Additionally, IAdd(0), IAdd(1), IAdd(2), IAdd(3), IAdd(4), IAdd(5), IAdd(6), IAdd(7) all separately add F₀ /2.

During cycle 2, F₁₆ and F₈ are provided to the input bus, F₈ is multiplied by c₄, and F₁₆ is multiplied by c₄. Additionally, IAdd(8), IAdd(11), IAdd(21), IAdd(31) all separately add c₄ F₁ and IAdd(39), IAdd(45), IAdd(51), IAdd (57) all separately subtract c₄ F₁.
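
The running totals held by the input accumulators after these first two cycles can be modelled as shown below. This is a minimal sketch, not a description of the circuit: `run_first_two_cycles` is a hypothetical name, the 64 accumulators are modelled as a plain list, and c₄ is supplied by the caller.

```python
def run_first_two_cycles(F, c4):
    """Model the accumulations described for cycles 1 and 2 of the IDCT."""
    iadd = [0.0] * 64  # one running total per input accumulator IAdd(n)

    # Cycle 1: IAdd(0) to IAdd(7) each add F0 / 2.
    for n in range(8):
        iadd[n] += F[0] / 2

    # Cycle 2: four accumulators add c4 * F1 and four subtract it.
    for n in (8, 11, 21, 31):
        iadd[n] += c4 * F[1]
    for n in (39, 45, 51, 57):
        iadd[n] -= c4 * F[1]

    return iadd
```

The later cycles follow the same pattern, each with its own lists of adding and subtracting accumulators.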

During cycle 3, F₉ and F₂ are provided to the input bus and F₂ is multiplied by c₄. Additionally, IAdd(8), IAdd (9), IAdd (17), IAdd (25), IAdd (33), IAdd (41), IAdd(49), IAdd(57) all separately add c₄ F₈, IAdd(0), IAdd(10), IAdd(12), IAdd(20), IAdd(22), IAdd(30), IAdd(56) all separately add F₉, IAdd(2), IAdd(38), IAdd(44), IAdd(46), IAdd(50), IAdd(52), IAdd(58) all separately subtract F₉ and IAdd(10), IAdd(18), IAdd(26), IAdd(34), IAdd(42), IAdd(50), IAdd(56), IAdd(58) all separately add c₄ F₁₆.

During cycle 4, F₃ and F₁₀ are provided to the input bus and F₃ is multiplied by c₄. Additionally, IAdd (14), IAdd(54), IAdd(56), IAdd(58) all separately add c₄ F₂, IAdd(22), IAdd(26), IAdd(34), IAdd(46) all separately subtract c₄ F₂, IAdd(8), IAdd(13), IAdd(15), IAdd(48), IAdd (53), IAdd (55), IAdd (57), IAdd (59) all separately add F₁₀, and IAdd(21), IAdd(23), IAdd(25), IAdd(27), IAdd(33), IAdd(35), IAdd(45), IAdd(47) all separately subtract F₁₀.

During cycle 5, F₁₇ and F₂₄ are provided to the input bus and F₂₄ is multiplied by c₄. Additionally, IAdd(37), IAdd(41), IAdd(48), IAdd(55) all separately add c₄ F₃, IAdd (15), IAdd (17), IAdd (29), IAdd (59) all separately subtract c₄ F₃, IAdd (8), IAdd (9), IAdd (13), IAdd (19), IAdd (23), IAdd (29), IAdd (39), IAdd (48) all separately add F₁₇, and IAdd (31), IAdd (37), IAdd (43), IAdd (47), IAdd(49), IAdd(53), IAdd(57), IAdd(59) all separately subtract F₁₇.

During cycle 6, F₃₂ and F₂₅ are provided to the input bus and F₃₂ is multiplied by c₄. Additionally, IAdd(11), IAdd(19), IAdd(27), IAdd(35), IAdd(43), IAdd(48), IAdd(51), IAdd(59) all separately add c₄ F₂₄, IAdd(1), IAdd(14), IAdd(18), IAdd(28), IAdd(38), IAdd(40), IAdd(56) all separately add F₂₅, and IAdd(3), IAdd(30), IAdd(36), IAdd(42), IAdd(54), IAdd(58), IAdd(60) all separately subtract F₂₅.

During cycle 7, F₁₈ and F₁₁ are provided to the input bus. Additionally, IAdd(4), IAdd(36), IAdd(38), IAdd(40), IAdd(42), IAdd(54), IAdd(56) all separately add F₁₁, IAdd(6), IAdd(14), IAdd(18), IAdd(28), IAdd(30), IAdd(58), IAdd(60) all separately subtract F₁₁, IAdd(0), IAdd(2), IAdd(12), IAdd(40), IAdd(52), IAdd(60) all separately add F₁₈, IAdd(5), IAdd(7), IAdd(20), IAdd(28), IAdd(36), IAdd(44) all separately subtract F₁₈ and IAdd(12), IAdd(20), IAdd(28), IAdd(36), IAdd(40), IAdd(44), IAdd(52), IAdd(60) all separately add c₄ F₃₂.

During cycle 8, F₄ and F₅ are provided to the input bus, and F₄ is multiplied by c₄, and F₅ is multiplied by c₄.

During cycle 9, F₁₂ and F₁₉ are provided to the input bus. Additionally, IAdd(28), IAdd(36), IAdd(40), IAdd(60) all separately add c₄ F₄, IAdd(44), IAdd(52), IAdd(12), IAdd(20) all separately subtract c₄ F₄, IAdd(23), IAdd(27), IAdd(32), IAdd(49) all separately add c₄ F₅, IAdd(9), IAdd(35), IAdd(47), IAdd(62) all separately subtract c₄ F₅, IAdd(27), IAdd(29), IAdd(32), IAdd(35), IAdd(37), IAdd(48), IAdd(59), IAdd(62) all separately add F₁₂, IAdd(11), IAdd(13), IAdd(19), IAdd(21), IAdd(43), IAdd(45), IAdd(51), IAdd(53) all separately subtract F₁₂, IAdd(8), IAdd(15), IAdd(32), IAdd(35), IAdd(39), IAdd(41), IAdd(43), IAdd(53) all separately add F₁₉, and IAdd(13), IAdd(17), IAdd(19), IAdd(27), IAdd(31), IAdd(55), IAdd(57), IAdd(62) all separately subtract F₁₉.

During cycle 10, F₃₃ and F₂₆ are provided to the input bus. Additionally, IAdd(8), IAdd(11), IAdd(23), IAdd(32), IAdd(47), IAdd(51), IAdd(57), IAdd(62) all separately add F₂₆, IAdd(15), IAdd(19), IAdd(25), IAdd(29), IAdd(33), IAdd(37), IAdd(43), IAdd(55) all separately subtract F₂₆, IAdd(9), IAdd(15), IAdd(17), IAdd(27), IAdd(32), IAdd(37), IAdd(47), IAdd(48) all separately add F₃₃, and IAdd(23), IAdd(29), IAdd(35), IAdd (41), IAdd (49), IAdd (55), IAdd (59), IAdd (62) all separately subtract F₃₃.

During cycle 11, F₄₈ and F₄₀ are provided to the input bus, F₄₀ is multiplied by c₄, and F₄₈ is multiplied by c₄.

During cycle 12, F₃₄ and F₄₁ are provided to the input bus. Additionally, IAdd(10), IAdd(22), IAdd(24), IAdd(46), IAdd(50), IAdd(56), IAdd(58), IAdd(61) all separately add F₃₄, IAdd(14), IAdd(18), IAdd(26), IAdd(30), IAdd(34), IAdd(38), IAdd(42), IAdd(54) all separately subtract F₃₄, IAdd(13), IAdd(21), IAdd(29), IAdd(32), IAdd(37), IAdd(45), IAdd(53), IAdd(62) all separately add c₄ F₄₀, IAdd(6), IAdd(10), IAdd(24), IAdd(26), IAdd(36), IAdd(40), IAdd(46) all separately add F₄₁, IAdd(4), IAdd(22), IAdd(28), IAdd(34), IAdd(50), IAdd(60), IAdd(61) all separately subtract F₄₁, and IAdd(14), IAdd(22), IAdd(24), IAdd(30), IAdd(38), IAdd(46), IAdd(54), IAdd(61) all separately add c₄ F₄₈.

During cycle 13, F₂₀ and F₂₇ are provided to the input bus. Additionally, IAdd(24), IAdd(26), IAdd(30), IAdd(34), IAdd(38), IAdd(56), IAdd(58), IAdd(61) all separately add F₂₀, IAdd(10), IAdd(14), IAdd(18), IAdd(22), IAdd(42), IAdd(46), IAdd(50), IAdd(54) all separately subtract F₂₀, IAdd(0), IAdd(14), IAdd(24), IAdd(34), IAdd(42), IAdd(44), IAdd(52) all separately add F₂₇, and IAdd(2), IAdd(12), IAdd(18), IAdd(20), IAdd(26), IAdd(54), IAdd(61) all separately subtract F₂₇.

During cycle 14, F₆ and F₁₃ are provided to the input bus and F₆ is multiplied by c₄. Additionally, IAdd(3), IAdd(22), IAdd(24), IAdd(26), IAdd(28), IAdd(40), IAdd(50) all separately add F₁₃, and IAdd(1), IAdd(10), IAdd(34), IAdd(36), IAdd(46), IAdd(60), IAdd(61) all separately subtract F₁₃.

During cycle 15, F₁₄ and F₇ are provided to the input bus and F₇ is multiplied by c₄. Additionally, IAdd(18), IAdd(24), IAdd(42), IAdd(61) all separately add c₄ F₆, IAdd(10), IAdd(30), IAdd(38), IAdd(50) all separately subtract c₄ F₆, IAdd(16), IAdd(17), IAdd(19), IAdd(32), IAdd(41), IAdd(43), IAdd(62), IAdd(63) all separately add F₁₄, and IAdd (9), IAdd (11), IAdd (29), IAdd (31), IAdd(37), IAdd(39), IAdd(49), IAdd(51) all separately subtract F₁₄.

During cycle 16, F₂₈ and F₂₁ are provided to the input bus. Additionally, IAdd(16), IAdd(19), IAdd(33), IAdd(53) all separately add c₄ F₇, IAdd(13), IAdd(25), IAdd(43), IAdd(63) all separately subtract c₄ F₇, IAdd (16), IAdd (21), IAdd (25), IAdd (29), IAdd (47), IAdd(48), IAdd(49), IAdd(51) all separately add F₂₁, IAdd (9), IAdd (11), IAdd (23), IAdd (33), IAdd (37), IAdd(45), IAdd(59), IAdd(63) all separately subtract F₂₁, IAdd (8), IAdd (16), IAdd (25), IAdd (31), IAdd (33), IAdd(39), IAdd(57), IAdd(63) all separately add F₂₈, and IAdd (9), IAdd (15), IAdd (17), IAdd (23), IAdd (41), IAdd(47), IAdd(49), IAdd(55) all separately subtract F₂₈.

During cycle 17, F₄₂ and F₃₅ are provided to the input bus. Additionally, IAdd(8), IAdd(13), IAdd(16), IAdd(31), IAdd(33), IAdd(43), IAdd(45), IAdd(51) all separately add F₃₅, IAdd(11), IAdd(19), IAdd(21), IAdd(25), IAdd(39), IAdd(53), IAdd(57), IAdd(63) all separately subtract F₃₅, IAdd(9), IAdd(16), IAdd(21), IAdd(45), IAdd(48), IAdd(49), IAdd(59), IAdd(63) all separately add F₄₂, and IAdd(13), IAdd(17), IAdd(27), IAdd(31), IAdd(35), IAdd(39), IAdd(41), IAdd(53) all separately subtract F₄₂.

During cycle 18, F₅₆ and F₄₉ are provided to the input bus and F₅₆ is multiplied by c₄. Additionally, IAdd (11), IAdd (16), IAdd (17), IAdd (25), IAdd (32), IAdd(35), IAdd(45), IAdd(55) all separately add F₄₉, and IAdd (15), IAdd (21), IAdd (27), IAdd (33), IAdd (41), IAdd(51), IAdd(62), IAdd(63) all separately subtract F₄₉.

During cycle 19, F₅₀ and F₅₇ are provided to the input bus. Additionally, IAdd (1), IAdd (3), IAdd (20), IAdd(40), IAdd(44), IAdd(60) all separately add F₅₀, IAdd(4), IAdd(6), IAdd(12), IAdd(28), IAdd(36), IAdd(52) all separately subtract F₅₀, IAdd(15), IAdd(16), IAdd (23), IAdd (31), IAdd (39), IAdd (47), IAdd (55), IAdd(63) all separately add c₄ F₅₆, IAdd(7), IAdd(12), IAdd(18), IAdd(24), IAdd(34), IAdd(44), IAdd(54) all separately add F₅₇, and IAdd (5), IAdd (14), IAdd (20), IAdd(26), IAdd(42), IAdd(52), IAdd(61) all separately subtract F₅₇.

During cycle 20, F₃₆ and F₄₃ are provided to the input bus. Additionally, IAdd(0), IAdd(2), IAdd(5), IAdd(7) all separately add F₃₆, IAdd(1), IAdd(3), IAdd(4), IAdd(6) all separately subtract F₃₆, IAdd(5), IAdd(12), IAdd(30), IAdd(44), IAdd(46), IAdd(50), IAdd(56) all separately add F₄₃, and IAdd(7), IAdd(10), IAdd(20), IAdd(22), IAdd(38), IAdd(52), IAdd(58) all separately subtract F₄₃.

During cycle 21, F₂₉ and F₂₂ are provided to the input bus. Additionally, IAdd(4), IAdd(6), IAdd(20), IAdd (40), IAdd (44), IAdd (60) all separately add F₂₂, IAdd(1), IAdd(3), IAdd(12), IAdd(28), IAdd(36), IAdd(52) all separately subtract F₂₂, IAdd(7), IAdd(20), IAdd(30), IAdd(46), IAdd(50), IAdd(52), IAdd(56) all separately add F₂₉, IAdd(5), IAdd(10), IAdd(12), IAdd(22), IAdd(38), IAdd(44), IAdd(58) all separately subtract F₂₉.

During cycle 22, F₁₅ and F₂₃ are provided to the input bus. Additionally, IAdd(5), IAdd(18), IAdd(20), IAdd (24), IAdd (34), IAdd (52), IAdd (54) all separately add F₁₅ and IAdd (7), IAdd (12), IAdd (14), IAdd (26), IAdd(42), IAdd(44), IAdd(61) all separately subtract F₁₅.

During cycle 23, F₃₀ and F₃₇ are provided to the input bus. Additionally, IAdd(17), IAdd(21), IAdd(32), IAdd(33), IAdd(35), IAdd(51), IAdd(55), IAdd(63) all separately add F₂₃, IAdd(11), IAdd(15), IAdd(16), IAdd(25), IAdd(27), IAdd(41), IAdd(45), IAdd(62) all separately subtract F₂₃, IAdd(17), IAdd(21), IAdd(31), IAdd(39), IAdd(41), IAdd(45), IAdd(48), IAdd(59) all separately add F₃₀, IAdd(9), IAdd(13), IAdd(16), IAdd(27), IAdd(35), IAdd(49), IAdd(53), IAdd(63) all separately subtract F₃₀, IAdd(8), IAdd(19), IAdd(25), IAdd(31), IAdd(45), IAdd(51), IAdd(53), IAdd(63) all separately add F₃₇, and IAdd(11), IAdd(13), IAdd(16), IAdd(21), IAdd(33), IAdd(39), IAdd(43), IAdd(57) all separately subtract F₃₇.

During cycle 24, F₄₄ and F₅₁ are provided to the input bus. Additionally, IAdd(8), IAdd(15), IAdd(23), IAdd(25), IAdd(33), IAdd(47), IAdd(55), IAdd(57) all separately add F₄₄, IAdd(9), IAdd(16), IAdd(17), IAdd(31), IAdd(39), IAdd(41), IAdd(49), IAdd(63) all separately subtract F₄₄, IAdd(11), IAdd(29), IAdd(33), IAdd(45), IAdd(47), IAdd(48), IAdd(49), IAdd(63) all separately add F₅₁, IAdd(9), IAdd(16), IAdd(21), IAdd(23), IAdd(25), IAdd(37), IAdd(51), IAdd(59) all separately subtract F₅₁.

During cycle 25, F₅₈ and F₅₉ are provided to the input bus. Additionally, IAdd(9), IAdd(19), IAdd(31), IAdd(32), IAdd(39), IAdd(43), IAdd(49), IAdd(62) all separately add F₅₈, IAdd(11), IAdd(16), IAdd(17), IAdd(29), IAdd(37), IAdd(41), IAdd(51), IAdd(63) all separately subtract F₅₈, IAdd(3), IAdd(10), IAdd(28), IAdd(34), IAdd(40), IAdd(46), IAdd(61) all separately add F₅₉, and IAdd(1), IAdd(22), IAdd(24), IAdd(26), IAdd(36), IAdd(50), IAdd(60) all separately subtract F₅₉.

During cycle 26, F₅₂ and F₄₅ are provided to the input bus. Additionally, IAdd(0), IAdd(18), IAdd(26), IAdd(44), IAdd(52), IAdd(54), IAdd(61) all separately add F₄₅, IAdd (2), IAdd (12), IAdd (14), IAdd (20), IAdd (24), IAdd(34), IAdd(42) all separately subtract F₄₅, IAdd(14), IAdd(22), IAdd(26), IAdd(34), IAdd(46), IAdd(54), IAdd(56), IAdd(58) all separately add F₅₂, IAdd(10), IAdd (18), IAdd (24), IAdd (30), IAdd (38), IAdd (42), IAdd(50), IAdd(61) all separately subtract F₅₂. Additionally, IAdd(16) provides τ₁₆ to Sbus(5) and IAdd(63) provides τ₆₃ to Sbus(4).

During cycle 27, F₃₈ and F₃₁ are provided to the input bus. Additionally, IAdd(6), IAdd(22), IAdd(34), IAdd(36), IAdd(40), IAdd(50), IAdd(61) all separately add F₃₁, IAdd (4), IAdd (10), IAdd (24), IAdd (26), IAdd (28), IAdd(46), IAdd(60) all separately subtract F₃₁, IAdd(18), IAdd (22), IAdd (30), IAdd (38), IAdd (42), IAdd (46), IAdd(56), IAdd(58) all separately add F₃₈, and IAdd(10), IAdd(14), IAdd(24), IAdd(26), IAdd(34), IAdd(50), IAdd (54), IAdd (61) all separately subtract F₃₈. Additionally, Cmult(1) provides c₁ ·τ₆₃ to Pbus(6), Rmult(1) provides c₁ ·τ₁₆ to Pbus(5), Cmult(3) provides c₃ ·τ₆₃ to Pbus(4), Cmult(5) provides c₅ ·τ₁₆ to Pbus(3), Rmult(5) provides c₅ ·τ₆₃ to Pbus(2), Cmult(7) provides c₇ ·τ₆₃ to Pbus(1), and Rmult(7) provides c₇ ·τ₁₆ to Pbus(0).

During cycle 28, F₃₉ and F₄₆ are provided to the input bus. Additionally, IAdd(15), IAdd(17), IAdd(23), IAdd(35), IAdd(37), IAdd(48), IAdd(49), IAdd(62) all separately add F₃₉, IAdd(9), IAdd(27), IAdd(29), IAdd(32), IAdd(41), IAdd(47), IAdd(55), IAdd(59) all separately subtract F₃₉, IAdd (8), IAdd (19), IAdd (23), IAdd(29), IAdd(37), IAdd(43), IAdd(47), IAdd(57) all separately add F₄₆, and IAdd(11), IAdd(15), IAdd(25), IAdd(32), IAdd(33), IAdd(51), IAdd(55), IAdd(62) all separately subtract F₄₆. Also, IAdd(24) provides τ₂₄ to Sbus(5), IAdd(26) provides τ₂₆ to Sbus(4), IAdd(34) provides τ₃₄ to Sbus(3), and IAdd(61) provides τ₆₁ to Sbus (2). Also, Cmult (3) provides c₃ ·τ₁₆ to Pbus (6).

During cycle 29, F₅₃ and F₆₀ are provided to the input bus. Additionally, IAdd(8), IAdd(17), IAdd(27), IAdd(39), IAdd(43), IAdd(53), IAdd(55), IAdd(62) all separately add F₅₃, IAdd(13), IAdd(15), IAdd(19), IAdd(31), IAdd(32), IAdd(35), IAdd(41), IAdd(57) all separately subtract F₅₃, IAdd(13), IAdd(21), IAdd(27), IAdd(35), IAdd(45), IAdd(48), IAdd(53), IAdd(59) all separately add F₆₀, and IAdd(11), IAdd(19), IAdd(29), IAdd(32), IAdd(37), IAdd(43), IAdd(51), IAdd(62) all separately subtract F₆₀. Also, Cmult (2) provides c₂ ·τ₃₄ to Pbus(6) and Cmult(6) provides c₆ ·τ₂₄ to Pbus(5).

During cycle 30, F₆₁ and F₅₄ are provided to the input bus. Additionally, IAdd(0), IAdd(2), IAdd(20), IAdd(28), IAdd(36), IAdd(44) all separately add F₅₄, IAdd(5), IAdd(7), IAdd(12), IAdd(40), IAdd(52), IAdd(60) all separately subtract F₅₄, IAdd(6), IAdd(28), IAdd(38), IAdd(42), IAdd(54), IAdd(56), IAdd(60) all separately add F₆₁, and IAdd(4), IAdd(14), IAdd(18), IAdd(30), IAdd(36), IAdd(40), IAdd(58) all separately subtract F₆₁. Also, IAdd(11) provides τ₁₁ to Sbus(5), IAdd(17) provides τ₁₇ to Sbus(4), IAdd(32) provides τ₃₂ to Sbus(3), IAdd(41) provides τ₄₁ to Sbus(2), IAdd(51) provides τ₅₁ to Sbus(1), and IAdd(62) provides τ₆₂ to Sbus(0). Also, Cmult(2) provides c₂ ·τ₂₆ to Pbus(6) and Cmult(6) provides c₆ ·τ₆₁ to Pbus (5).

During cycle 31, F₄₇ and F₅₅ are provided to the input bus. Additionally, IAdd (3), IAdd (14), IAdd (18), IAdd(36), IAdd(38), IAdd(56), IAdd(60) all separately add F₄₇, IAdd(1), IAdd(28), IAdd(30), IAdd(40), IAdd(42), IAdd(54), IAdd(58) all separately subtract F₄₇, IAdd(8), IAdd(13), IAdd(19), IAdd(37), IAdd(39), IAdd(47), IAdd(49), IAdd(59) all separately add F₅₅, and IAdd(9), IAdd(23), IAdd(29), IAdd(31), IAdd(43), IAdd(48), IAdd (53), IAdd (57) all separately subtract F₅₅. Also, IAdd(4) provides τ₄ to Sbus(5), IAdd(5) provides τ₅ to Sbus(4), IAdd(6) provides τ₆ to Sbus(3), and IAdd(7) provides τ₇ to Sbus(2). Also, Cmult(1) provides c₁ ·τ₅₁ to Pbus (6), Rmult (1) provides c₁ ·τ₃₂ to Pbus (5), Cmult (2) provides c₂ ·τ₆₁ to Pbus (4), Cmult (3) provides c₃ ·τ₅₁ to Pbus(3), Cmult(5) provides c₅ ·τ₅₁ to Pbus(2), Cmult(7) provides c₇ ·τ₅₁ to Pbus(1) and Rmult(7) provides c₇ ·τ₄₁ to Pbus (0).

During cycle 32, F₆₂ and F₆₃ are provided to the input bus. Additionally, IAdd(8), IAdd(15), IAdd(21), IAdd(27), IAdd(35), IAdd(45), IAdd(55), IAdd(57) all separately add F₆₂, IAdd(13), IAdd(23), IAdd(25), IAdd(33), IAdd(47), IAdd(48), IAdd(53), IAdd(59) all separately subtract F₆₂, IAdd(0), IAdd(12), IAdd(20), IAdd(38), IAdd(46), IAdd(50), IAdd(58) all separately add F₆₃, and IAdd(2), IAdd(10), IAdd(22), IAdd(30), IAdd(44), IAdd(52), IAdd(56) all separately subtract F₆₃. Also, IAdd(37) provides τ₃₇ to Sbus(5) and IAdd(54) provides τ₅₄ to Sbus(4). Also, Cmult(0) provides c₀ ·τ₅ to Pbus(6), Cmult(1) provides c₁ ·τ₁₇ to Pbus(5), Rmult(1) provides c₁ ·τ₆₂ to Pbus(4), Cmult(2) provides c₂ ·τ₂₄ to Pbus(3), Cmult(3) provides c₃ ·τ₄₁ to Pbus(2), Cmult(5) provides c₅ ·τ₄₁ to Pbus(1), and Cmult(7) provides c₇ ·τ₃₂ to Pbus(0).

During cycle 33, IAdd(10) provides τ₁₀ to Sbus(5), IAdd(14) provides τ₁₄ to Sbus(4), IAdd(22) provides τ₂₂ to Sbus(3), IAdd(40) provides τ₄₀ to Sbus(2), IAdd(50) provides τ₅₀ to Sbus(1), and IAdd(52) provides τ₅₂ to Sbus(0). Also, Cmult(0) provides c₀ ·τ₇ to Pbus(6), Cmult(2) provides c₂ ·τ₅₄ to Pbus(5), Cmult(3) provides c₃ ·τ₁₁ to Pbus(4), Rmult(3) provides c₃ ·τ₃₂ to Pbus(3), Cmult(5) provides c₅ ·τ₁₁ to Pbus(2), Cmult(6) provides c₆ ·τ₅₄ to Pbus(1), and Cmult(7) provides c₇ ·τ₁₇ to Pbus(0).

During cycle 34, IAdd (9) provides τ₉ to Sbus (5), IAdd (20) provides τ₂₀ to Sbus (4), IAdd (23) provides τ₂₃ to Sbus(3), IAdd(47) provides τ₄₇ to Sbus(2), and IAdd(49) provides τ₄₉ to Sbus(1). Also, Cmult(1) provides c₁ ·τ₃₇ to Pbus(6), Rmult(1) provides c₁ ·τ₄₁ to Pbus(5), Cmult(2) provides c₂ ·τ₂₂ to Pbus (4), Cmult (4) provides c₄ ·τ₄₀ to Pbus(3), Cmult(5) provides c₅ ·τ₃₇ to Pbus(2), Cmult(6) provides c₆ ·τ₂₆ to Pbus (1), and Cmult (7) provides c₇ ·τ₁₁ to Pbus (0).

During cycle 35, IAdd(19) provides τ₁₉ to Sbus(5), IAdd(27) provides τ₂₇ to Sbus(4), IAdd(36) provides τ₃₆ to Sbus(3), IAdd(43) provides τ₄₃ to Sbus(2), IAdd(55) provides τ₅₅ to Sbus(1) and IAdd(60) provides τ₆₀ to Sbus(0). Additionally, Cmult(1) provides c₁ ·τ₉ to Pbus(6), Cmult(2) provides c₂ ·τ₁₀ to Pbus(5), Cmult(3) provides c₃ ·τ₂₃ to Pbus(4), Rmult(3) provides c₃ ·τ₆₂ to Pbus(3), Cmult(5) provides c₅ ·τ₁₇ to Pbus(2), Cmult(6) provides c₆ ·τ₁₀ to Pbus(1), and Cmult(7) provides c₇ ·τ₂₃ to Pbus(0).

During cycle 36, IAdd(15) provides τ₁₅ to Sbus(5), IAdd(21) provides τ₂₁ to Sbus(4), IAdd(28) provides τ₂₈ to Sbus(3), IAdd(30) provides τ₃₀ to Sbus(2), IAdd(45) provides τ₄₅ to Sbus(1) and IAdd(59) provides τ₅₉ to Sbus(0). Additionally, Cmult(1) provides c₁ ·τ₄₇ to Pbus(6), Cmult(3) provides c₃ ·τ₂₇ to Pbus(5), Cmult(4) provides c₄ ·τ₅₂ to Pbus(4), Cmult(5) provides c₅ ·τ₄₇ to Pbus(3), Rmult(5) provides c₅ ·τ₁₉ to Pbus(2), Cmult(7) provides c₇ ·τ₃₇ to Pbus(1), and Rmult(7) provides c₇ ·τ₂₇ to Pbus(0).

During cycle 37, IAdd(3) provides τ₃ to Sbus(5), IAdd(13) provides τ₁₃ to Sbus(4), IAdd(25) provides τ₂₅ to Sbus(3), IAdd(48) provides τ₄₈ to Sbus(2), IAdd(57) provides τ₅₇ to Sbus(1) and IAdd(53) provides τ₅₃ to Sbus(0). Additionally, Cmult(1) provides c₁ ·τ₄₅ to Pbus(6), Rmult(1) provides c₁ ·τ₄₃ to Pbus(5), Cmult(3) provides c₃ ·τ₄₉ to Pbus (4), Cmult (5) provides c₅ ·τ₄₉ to Pbus(3), Rmult(5) provides c₅ ·τ₁₅ to Pbus(2), Cmult(6) provides c₆ ·τ₃₀ to Pbus (1), and Cmult (7) provides c₇ ·τ₉ to Pbus (0).

During cycle 38, IAdd(12) provides τ₁₂ to Sbus(5), IAdd(44) provides τ₄₄ to Sbus(4), IAdd(56) provides τ₅₆ to Sbus(3) and IAdd(58) provides τ₅₈ to Sbus(2). Additionally, Cmult(1) provides c₁ ·τ₄₉ to Pbus(6), Cmult(3) provides c₃ ·τ₅₅ to Pbus(5), Rmult(3) provides c₃ ·τ₄₅ to Pbus (4), Cmult (4) provides c₄ ·τ₃₆ to Pbus (3), Cmult (5) provides c₅ ·τ₄₈ to Pbus (2), Rmult (5) provides c₅ ·τ₁₃ to Pbus(1), and Cmult(7) provides c₇ ·τ₁₅ to Pbus(0).

During cycle 39, IAdd(0) provides τ₀ to Sbus(5), IAdd (1) provides τ₁ to Sbus (4), IAdd (2) provides τ₂ to Sbus(3), IAdd(29) provides τ₂₉ to Sbus(2), and IAdd(35) provides τ₃₅ to Sbus(1). Additionally, Cmult(3) provides c₃ ·τ₁₉ to Pbus (6), Rmult (3) provides c₃ ·τ₉ to Pbus (5), Cmult (4) provides c₄ ·τ₆₀ to Pbus (4), Cmult (5) provides c₅ ·τ₄₃ to Pbus(3), Rmult(5) provides c₅ ·τ₂₇ to Pbus(2), Cmult(6) provides c₆ ·τ₃₄ to Pbus(1), and Cmult(7) provides c₇ ·τ₅₅ to Pbus(0).

During cycle 40, IAdd(8) provides τ₈ to Sbus(5), IAdd(18) provides τ₁₈ to Sbus(4), IAdd(33) provides τ₃₃ to Sbus(3), and IAdd(46) provides τ₄₆ to Sbus(2). Additionally, Cmult(2) provides c₂ ·τ₅₆ to Pbus(6), Cmult(3) provides c₃ ·τ₂₁ to Pbus(5), Rmult(3) provides c₃ ·τ₃₅ to Pbus(4), Cmult(4) provides c₄ ·τ₄₄ to Pbus(3), Cmult(5) provides c₅ ·τ₂₅ to Pbus(2), Cmult(7) provides c₇ ·τ₅₉ to Pbus(1) and Rmult(7) provides c₇ ·τ₄₇ to Pbus(0).

During cycle 41, IAdd (31) provides τ₃₁ to Sbus (5), IAdd(38) provides τ₃₈ to Sbus(4), IAdd(39) provides τ₃₉ to Sbus (3), and IAdd (42) provides τ₄₂ to Sbus (2). Additionally, Cmult(0) provides c₀ ·τ₆ to Pbus(6), Cmult (1) provides c₁ ·τ₂₃ to Pbus (5), Cmult (3) provides c₃ ·τ₅₉ to Pbus (4), Cmult (5) provides c₅ ·τ₂₁ to Pbus (3), Rmult (5) provides c₅ ·τ₅₅ to Pbus (2), Cmult (6) provides c₆ ·τ₅₆ to Pbus(1), and Cmult(7) provides c₇ ·τ₂₉ to Pbus(0).

During cycle 42, Cmult(1) provides c₁ ·τ₁₅ to Pbus(6), Rmult(1) provides c₁ ·τ₅₃ to Pbus(5), Cmult(4) provides c₄ ·τ₂₈ to Pbus (4), Cmult (2) provides c₂ ·τ₃₈ to Pbus (3), Cmult(5) provides c₅ ·τ₆₂ to Pbus(2), Cmult(7) provides c₇ ·τ₂₁ to Pbus(1), and Rmult(7) provides c₇ ·τ₈ to Pbus(0).

During cycle 43, Cmult (0) provides c₀ ·τ₀ to Pbus (6), Cmult(1) provides c₁ ·τ₃₃ to Pbus(5), Rmult(1) provides c₁ ·τ₃₅ to Pbus (4), Cmult (3) provides c₃ ·τ₂₅ to Pbus (3), Cmult (5) provides c₅ ·τ₅₉ to Pbus (2), Cmult (6) provides c₆ ·τ₄₂ to Pbus (1), and Cmult (7) provides c₇ ·τ₄₈ to Pbus (0).

During cycle 44, Cmult(0) provides c₀ ·τ₄ to Pbus(6), Cmult(1) provides c₁ ·τ₂₅ to Pbus(5), Cmult(2) provides c₂ ·τ₄₆ to Pbus(4), Cmult(3) provides c₃ ·τ₅₃ to Pbus(3), Rmult(3) provides c₃ ·τ₁₃ to Pbus(2), Cmult(5) provides c₅ ·τ₈ to Pbus (1), and Cmult (6) provides c₆ ·τ₃₈ to Pbus (0).

During cycle 45, Cmult(1) provides c₁ ·τ₃₉ to Pbus(6), Rmult(1) provides c₁ ·τ₃₁ to Pbus(5), Cmult(2) provides c₂ ·τ₅₈ to Pbus(4), Cmult(3) provides c₃ ·τ₃₉ to Pbus(3), Cmult(4) provides c₄ ·τ₁₂ to Pbus(2), Cmult(7) provides c₇ ·τ₄₅ to Pbus(1), and Rmult(7) provides c₇ ·τ₂₅ to Pbus(0).

During cycle 46, Cmult(1) provides c₁ ·τ₂₉ to Pbus(6), Cmult (3) provides c₃ ·τ₂₉ to Pbus (5), Rmult (3) provides c₃ ·τ₁₅ to Pbus(4), Cmult(5) provides c₅ ·τ₄₅ to Pbus(3), Rmult (5) provides c₅ ·τ₉ to Pbus (2), Cmult (7) provides c₇ ·τ₃₁ to Pbus(1), and Rmult(7) provides c₇ ·τ₄₃ to Pbus(0).

During cycle 47, Cmult(1) provides c₁ ·τ₅₇ to Pbus(6), Rmult(1) provides c₁ ·τ₁₃ to Pbus(5), Cmult(2) provides c₂ ·τ₄₂ to Pbus (4), Cmult (3) provides c₃ ·τ₈ to Pbus (3), Rmult(3) provides c₃ ·τ₄₃ to Pbus(2), Cmult(4) provides c₄ ·τ₂₀ to Pbus(1), and Cmult(7) provides c₇ ·τ₅₇ to Pbus(0).

During cycle 48, Cmult(1) provides c₁ ·τ₁₁ to Pbus(6), Cmult(2) provides c₂ ·τ₅₀ to Pbus(5), Cmult(3) provides c₃ ·τ₃₁ to Pbus (4), Rmult (3) provides c₃ ·τ₁₇ to Pbus (3), Cmult(5) provides c₅ ·τ₃₉ to Pbus(2), Cmult(7) provides c₇ ·τ₆₂ to Pbus(1), and Rmult(7) provides c₇ ·τ₄₉ to Pbus(0).

During cycle 49, Cmult(0) provides c₀ ·τ₂ to Pbus(6), Cmult(1) provides c₁ ·τ₅₅ to Pbus(5), Cmult(2) provides c₂ ·τ₁₈ to Pbus(4), Cmult(5) provides c₅ ·τ₂₃ to Pbus(3), Rmult(5) provides c₅ ·τ₅₇ to Pbus(2), Cmult(6) provides c₆ ·τ₅₈ to Pbus (1), and Cmult (7) provides c₇ ·τ₅₃ to Pbus (0).

During cycle 50, Cmult(0) provides c₀ ·τ₁ to Pbus(6), Rmult(1) provides c₁ ·τ₄₈ to Pbus(5), Cmult(1) provides c₁ ·τ₅₉ to Pbus(4), Cmult(2) provides c₂ ·τ₁₄ to Pbus(3), Cmult(3) provides c₃ ·τ₅₇ to Pbus(2), Rmult(3) provides c₃ ·τ₄₇ to Pbus (1), and Cmult (6) provides c₆ ·τ₂₂ to Pbus (0).

During cycle 51, Cmult(0) provides c₀ ·τ₃ to Pbus(6), Cmult(1) provides c₁ ·τ₂₁ to Pbus(5), Cmult(3) provides c₃ ·τ₃₇ to Pbus(4), Cmult(5) provides c₅ ·τ₃₅ to Pbus(3), Rmult(5) provides c₅ ·τ₃₁ to Pbus(2), Cmult(6) provides c₆ ·τ₁₈ to Pbus(1), and Cmult(7) provides c₇ ·τ₁₉ to Pbus(0).

During cycle 52, Cmult(2) provides c₂ ·τ₃₀ to Pbus(6), Cmult (3) provides c₃ ·τ₄₈ to Pbus (5), Cmult (5) provides c₅ ·τ₃₃ to Pbus(4), Rmult(5) provides c₅ ·τ₃₂ to Pbus(3), Cmult(6) provides c₆ ·τ₅₀ to Pbus(2), Cmult(7) provides c₇ ·τ₃₉ to Pbus(1), and Rmult(7) provides c₇ ·τ₃₃ to Pbus(0).

During cycle 53, Cmult(1) provides c₁ ·τ₁₉ to Pbus(6), Rmult(1) provides c₁ ·τ₂₇ to Pbus(5), Cmult(5) provides c₅ ·τ₂₉ to Pbus(4), Rmult(5) provides c₅ ·τ₅₃ to Pbus(3), Cmult(6) provides c₆ ·τ₁₄ to Pbus(2), Cmult(7) provides c₇ ·τ₁₃ to Pbus(1), and Rmult(7) provides c₇ ·τ₃₅ to Pbus(0).

During cycle 54, Cmult (1) provides c₁ ·τ₈ to Pbus (2), Cmult (3) provides c₃ ·τ₃₃ to Pbus (1), and Cmult (6) provides c₆ ·τ₄₆ to Pbus(0).

During cycle 55, OAdd(0) and OAdd(1) provide f₀ and f₁ to the output bus. During cycle 56, OAdd(2) and OAdd(3) provide f₂ and f₃ to the output bus. During cycle 57, OAdd(4) and OAdd(5) provide f₄ and f₅ to the output bus.

During cycle 58, OAdd(6) and OAdd(7) provide f₆ and f₇ to the output bus. During cycle 59, OAdd(8) and OAdd(9) provide f₈ and f₉ to the output bus. During cycle 60, OAdd(10) and OAdd(11) provide f₁₀ and f₁₁ to the output bus. During cycle 61, OAdd(12) and OAdd(13) provide f₁₂ and f₁₃ to the output bus. During cycle 62, OAdd(14) and OAdd(15) provide f₁₄ and f₁₅ to the output bus. During cycle 63, OAdd(16) and OAdd(17) provide f₁₆ and f₁₇ to the output bus. During cycle 64, OAdd(18) and OAdd(19) provide f₁₈ and f₁₉ to the output bus. During cycle 65, OAdd(20) and OAdd(21) provide f₂₀ and f₂₁ to the output bus. During cycle 66, OAdd(22) and OAdd(23) provide f₂₂ and f₂₃ to the output bus. During cycle 67, OAdd(24) and OAdd(25) provide f₂₄ and f₂₅ to the output bus. During cycle 68, OAdd(26) and OAdd(27) provide f₂₆ and f₂₇ to the output bus. During cycle 69, OAdd(28) and OAdd(29) provide f₂₈ and f₂₉ to the output bus. During cycle 70, OAdd(30) and OAdd(31) provide f₃₀ and f₃₁ to the output bus. During cycle 71, OAdd(32) and OAdd(33) provide f₃₂ and f₃₃ to the output bus. During cycle 72, OAdd(34) and OAdd(35) provide f₃₄ and f₃₅ to the output bus. During cycle 73, OAdd(36) and OAdd(37) provide f₃₆ and f₃₇ to the output bus. During cycle 74, OAdd(38) and OAdd(39) provide f₃₈ and f₃₉ to the output bus. During cycle 75, OAdd(40) and OAdd(41) provide f₄₀ and f₄₁ to the output bus. During cycle 76, OAdd(42) and OAdd(43) provide f₄₂ and f₄₃ to the output bus. During cycle 77, OAdd(44) and OAdd(45) provide f₄₄ and f₄₅ to the output bus. During cycle 78, OAdd(46) and OAdd(47) provide f₄₆ and f₄₇ to the output bus. During cycle 79, OAdd(48) and OAdd(49) provide f₄₈ and f₄₉ to the output bus. During cycle 80, OAdd(50) and OAdd(51) provide f₅₀ and f₅₁ to the output bus. During cycle 81, OAdd(52) and OAdd(53) provide f₅₂ and f₅₃ to the output bus. During cycle 82, OAdd(54) and OAdd(55) provide f₅₄ and f₅₅ to the output bus.
During cycle 83, OAdd(56) and OAdd(57) provide f₅₆ and f₅₇ to the output bus. During cycle 84, OAdd(58) and OAdd(59) provide f₅₈ and f₅₉ to the output bus. During cycle 85, OAdd(60) and OAdd(61) provide f₆₀ and f₆₁ to the output bus. During cycle 86, OAdd(62) and OAdd(63) provide f₆₂ and f₆₃ to the output bus.
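
The output schedule listed above follows a regular pattern: two consecutive outputs are placed on the output bus per cycle, starting at cycle 55. As a compact restatement (the function name `output_cycle` is hypothetical), the cycle in which a given output fₙ appears can be computed as:

```python
def output_cycle(n):
    """Return the IDCT cycle in which OAdd(n) provides f_n to the output bus."""
    if not 0 <= n <= 63:
        raise ValueError("output index must be in 0..63")
    # Outputs leave in pairs (f0, f1), (f2, f3), ... from cycle 55 onward.
    return 55 + n // 2
```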

As with the forward transform, pipelining is used when performing the IDCT to allow more than one transform to be performed every 86 cycles. More specifically, after cycle 32, the input accumulators have finished with a first matrix, so that a second matrix can be provided to the input bus starting at cycle 33. Likewise, at cycle 65, the multiplication circuits have finished with the first matrix and so the second matrix may be provided to the multiplication circuits. Additionally, because the input accumulators are finished with the second matrix, a third matrix may be provided to the input bus.
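
The pipelined timing can be illustrated with a small Python sketch. It rests on two assumptions that go slightly beyond the text: that a new matrix is accepted every 32 cycles (as the starting cycles 1, 33 and 65 suggest), and that every matrix keeps the same 86-cycle latency from first input to last output. The names `INPUT_CYCLES`, `LATENCY` and `matrix_schedule` are hypothetical.

```python
INPUT_CYCLES = 32  # cycles during which one matrix occupies the input bus
LATENCY = 86       # cycles from first input to last output of one matrix


def matrix_schedule(k):
    """Return (first input cycle, last output cycle) for the k-th matrix.

    k = 0 is the first matrix. Assumes a new matrix starts every
    INPUT_CYCLES cycles and each one takes LATENCY cycles in total.
    """
    first_input = 1 + INPUT_CYCLES * k
    last_output = first_input + LATENCY - 1
    return first_input, last_output
```

Under these assumptions, three matrices would complete by cycle 150 rather than the 258 cycles that three unpipelined transforms would need.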

Referring to FIG. 3, an alternate DCT engine, which functions as a peripheral device for a computer, is shown. DCT engine 50 includes a plurality of input accumulator circuits 52 and a plurality of output accumulator circuits 54. Input accumulator circuits 52 and output accumulator circuits 54 receive input data from and provide sum data to Input/Output (I/O) bus 60. I/O bus 60 is also coupled with a standard processor 62 of a computer system.

Input accumulator circuits 52 are arranged in an 8×8 matrix. For notation purposes, these accumulator circuits are referred to as IAdd(L), where L indicates the location of the adder circuit in the matrix which is indexed to correspond to a singly indexed vector.

Output accumulator circuits 54 are arranged as an 8×8 matrix. For notation purposes, these accumulator circuits are referred to as OAdd(L), where L indicates the location of the accumulator circuit in the matrix which is indexed to correspond to a singly indexed vector.

When performing an FDCT or an IDCT, the accumulation times account for a substantial percentage of the time required in performing the DCTs. Accordingly, by providing the matrices of input and output accumulator circuits, DCT engine 50 significantly increases the speed with which a DCT is performed when compared to performing the DCT solely using software with a computer system's processor.

Referring to FIG. 4, every input accumulator circuit 52 and every output accumulator circuit 54 includes arithmetic unit 70 and address selection unit 72. Arithmetic unit 70 is coupled with the data bus portion of I/O bus 60. Arithmetic unit 70 also receives a plurality of control signals from address selection unit 72; these control signals include an add signal (+), a subtract signal (-), a clear signal and a write signal. In addition to providing the control signals to arithmetic unit 70, address selection unit 72 is coupled to the address bus portion of I/O bus 60 and receives a plurality of control signals from the I/O bus; these control signals include a read/write (R/W) signal and a select signal.
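
A behavioral model of one accumulator circuit may help fix ideas. The sketch below is illustrative only: the class name `AccumulatorCircuit` and its method names are hypothetical, and treating the write signal as "place the running total on the data bus" (modelled here by `read`) is an assumption, not something the text states.

```python
class AccumulatorCircuit:
    """Behavioral model of an arithmetic unit driven by four control signals."""

    def __init__(self):
        self.total = 0

    def clear(self):
        # Clear signal: reset the running total.
        self.total = 0

    def add(self, value):
        # Add signal (+): accumulate the value on the data bus.
        self.total += value

    def subtract(self, value):
        # Subtract signal (-): accumulate the negated value.
        self.total -= value

    def read(self):
        # Assumed meaning of the write signal: drive the total onto the bus.
        return self.total
```

For example, after `acc = AccumulatorCircuit()`, calling `acc.add(5)` and then `acc.subtract(2)` leaves `acc.read()` equal to 3, and `acc.clear()` returns it to 0.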

Referring to FIGS. 3 and 4, input accumulator circuits 52 and output accumulator circuits 54 are memory mapped to occupy 704 locations in an address space. More specifically, input accumulator circuits 52 are written using 64 input accumulator write addresses (II₀ to II₆₃), input accumulator circuits 52 are read using 64 input accumulator read addresses (IC₀ to IC₆₃), output accumulator circuits 54 are written using 512 (8×64) output accumulator write addresses (OI₀ ⁰ to OI₆₃ ⁷) and output accumulator circuits 54 are read using 64 output accumulator read addresses (OC₀ to OC₆₃). Two additional addresses, an input clear address (ICA) and an output clear address (OCA), are used to generate clear signals to the input accumulator circuits and the output accumulator circuits, respectively.
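
The count of 704 locations follows from the four address groups: 64 + 64 + 512 + 64. The Python sketch below lays the groups out back to back to check that arithmetic. The ordering of the groups, the base address of 0, and the formula used in `oi_offset` are all hypothetical; the text gives only the group sizes.

```python
# (group name, number of addresses) in an assumed, purely illustrative order
REGIONS = [("II", 64), ("IC", 64), ("OI", 8 * 64), ("OC", 64)]


def build_map(base=0):
    """Return ({name: (first, last)}, total size) for a contiguous layout."""
    layout = {}
    addr = base
    for name, size in REGIONS:
        layout[name] = (addr, addr + size - 1)
        addr += size
    return layout, addr - base


def oi_offset(location, constant):
    """Assumed offset of OI address (location 0..63, constant 0..7) in its group."""
    if not (0 <= location <= 63 and 0 <= constant <= 7):
        raise ValueError("location must be 0..63 and constant 0..7")
    return location * 8 + constant
```

The ICA and OCA clear addresses would sit outside these 704 locations.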

The operation of DCT engine 50 is similar to that of DCT engine 20. That is, a plurality of transform coefficients are generated, a plurality of transform products are generated by multiplying the transform coefficients by transform constants, and the plurality of transform products are accumulated to produce the transformed matrix. DCT engine 50 differs from DCT engine 20 in the way that the accumulators are accessed and in that DCT engine 50 does not include multiplication circuits. As the theory of the transform and the operation of the accumulators are similar to those of DCT engine 20, a description of the transform of location 0 of the matrix is sufficient to describe the operation of DCT engine 50.

When initiating a DCT, the processor clears all input accumulator circuits by writing to the ICA address and clears all output accumulator circuits by writing to the OCA address. Next, the processor writes the pixel matrix to the appropriate addresses II₀ to II₆₃. While this is occurring, IAdd(0) to IAdd(63) form the transform coefficients σ₀ to σ₆₃. The processor then reads the transform coefficient σ₀ from IAdd(0) via the address IC₀. Using this transform coefficient in combination with the other transform coefficients as well as the transform constants, the processor, which is coupled to I/O bus 60, generates the transform product values. These product values are written to the appropriate addresses to be read by the appropriate output accumulator circuits. For example, c₇·σ₆₃ is written to address OI₆₃⁷. Accordingly, address selection unit 72 of each output accumulator circuit which requires this transform product value responds to this address and causes the value to be added or subtracted. The transformed output for location 0, which is produced by OAdd(0), is then accessed by the computer processor via address OC₀.
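The sequence of bus cycles described above can be sketched from the processor's side as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the numeric addresses, the RecordingBus stand-in for I/O bus 60, the function name run_dct and the value written to the clear addresses are all hypothetical. The sketch further assumes that the product c_k·σ_i is written to address OI_i^k for every pair (i, k), which is consistent with the 512 write addresses and with the c₇·σ₆₃ example but is not stated in general. The add and subtract rules of the accumulators are not modeled; the stand-in bus only records cycles and returns canned read data.

```python
N, K = 64, 8
# Hypothetical address layout (the description gives counts only).
II, IC, OI, OC, ICA, OCA = 0, 64, 128, 640, 704, 705


class RecordingBus:
    """Stand-in for the I/O bus: logs cycles and returns canned read data."""

    def __init__(self, read_data=None):
        self.log = []
        self.read_data = read_data or {}

    def write(self, address, data):
        self.log.append(("W", address, data))

    def read(self, address):
        self.log.append(("R", address, None))
        return self.read_data.get(address, 0)


def run_dct(bus, pixels, constants):
    """Issue the bus cycles for one transform, in the order described."""
    bus.write(ICA, 0)                     # clear all input accumulators
    bus.write(OCA, 0)                     # clear all output accumulators
    for i, pixel in enumerate(pixels):    # write the pixel matrix
        bus.write(II + i, pixel)
    # Read the transform coefficients sigma_0 to sigma_63.
    sigma = [bus.read(IC + i) for i in range(N)]
    # Multiply in software and write each product to its own address.
    for k, c in enumerate(constants):
        for i, s in enumerate(sigma):
            bus.write(OI + k * N + i, c * s)
    # Read the transformed matrix from the output accumulators.
    return [bus.read(OC + i) for i in range(N)]
```

With this ordering a full transform costs 706 bus cycles (2 clears, 64 pixel writes, 64 coefficient reads, 512 product writes and 64 output reads), all multiplication being done by the processor between the coefficient reads and the product writes.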

OTHER EMBODIMENTS

Other embodiments are within the following claims.

For example, a DCT engine may be arranged as a peripheral device which includes only the input accumulators, as this is where the majority of the time for the transform is required. In this embodiment, the multiplication and the output accumulations are performed by a processor which is coupled to the DCT engine. Although this embodiment does not compute the transforms as quickly as the embodiment which includes both input and output accumulators, this embodiment requires only half the hardware of the embodiment which includes both input and output accumulators.

What is claimed is:
 1. An apparatus for performing a discrete cosine transform on an input matrix of data elements to provide a transformed matrix of data elements, the input matrix and the output matrix each having a plurality of row locations and a plurality of column locations, the apparatus comprising a plurality of input accumulators, the plurality of input accumulators accumulating data elements from the input matrix of data elements in parallel to provide a plurality of transform coefficient outputs, the plurality of input accumulators corresponding to respective row and column locations of the input matrix, each input accumulator providing an individual transform coefficient; a plurality of multiplication circuits coupled with the plurality of input accumulators, the plurality of multiplication circuits receiving the plurality of transform coefficient outputs and multiplying the transform coefficient outputs by transform constants to provide a plurality of transform products; and a plurality of output accumulators coupled with the plurality of multiplication circuits, each output accumulator receiving the transform products and accumulating the products to provide the transformed matrix of data elements.
 2. The apparatus of claim 1 wherein the input accumulators and the plurality of multiplication circuits are coupled via a transform coefficient bus.
 3. The apparatus of claim 2 wherein the transform coefficient bus includes a plurality of buses, the plurality of buses being cross coupled so that transform coefficients from any accumulator circuit may be received by any multiplication circuit.
 4. The apparatus of claim 1 wherein the multiplication circuits and the output accumulators are coupled via a product bus.
 5. The apparatus of claim 4 wherein the product bus includes a plurality of product buses, the plurality of product buses being cross coupled so that transform products from any multiplication circuit may be received by any output accumulator circuit.
 6. The apparatus of claim 1 wherein the plurality of output accumulators correspond to the row and column locations of the transformed matrix, and each output accumulator provides output data for an individual location of the transformed matrix of data elements.
 7. The apparatus of claim 1 wherein the discrete cosine transform uses a discrete plurality of constants, the plurality of multiplication circuits correspond to the discrete plurality of constants, and each multiplication circuit multiplies the plurality of transform coefficients by one of the discrete plurality of constants.
 8. An apparatus for performing a discrete cosine transform on a matrix of data elements to provide a transformed matrix of data elements, each matrix having a plurality of row locations and a plurality of column locations, the apparatus comprising a plurality of input accumulators, the plurality of input accumulators accumulating the input matrix of data elements in parallel to provide a plurality of transform coefficient outputs, the plurality of input accumulators corresponding to respective row and column locations of the input matrix, each input accumulator providing an individual transform coefficient; means for multiplying, the means for multiplying receiving the plurality of transform coefficient outputs and multiplying the coefficient outputs by constants to provide a plurality of transform products; and means for output accumulation, the means for output accumulation receiving the transform products and adding the transform products to provide the transformed matrix of data elements.
 9. The apparatus of claim 8 wherein the input accumulators are coupled via an input/output bus.
 10. The apparatus of claim 9 wherein the discrete cosine transform uses a discrete plurality of constants, the means for multiplying includes a computer processor coupled to the input output bus, the computer processor multiplying the plurality of transform coefficients by the discrete plurality of constants to provide the plurality of transform products.
 11. The apparatus of claim 10 wherein the means for output accumulation includes the computer processor, the computer processor accumulating the plurality of transform products to provide the transformed matrix of data elements.
 12. The apparatus of claim 9 wherein the means for output accumulation includes a plurality of output accumulators corresponding to the row and column locations of the transformed matrix, and each output accumulator provides output data element for an individual location of the transformed matrix of data elements.
 13. The apparatus of claim 9 wherein the discrete cosine transform uses a discrete plurality of constants, the means for multiplying includes a plurality of multiplication circuits corresponding to the discrete plurality of constants, and each multiplication circuit multiplies the plurality of transform coefficients by one of the discrete plurality of constants.
 14. A peripheral apparatus for accelerating arithmetic operations of a processor on a first matrix of first data elements to provide a second matrix of second data elements, the first and second matrices each having a plurality of row locations and a plurality of column locations, each data element of the first matrix of data elements having a first address corresponding thereto and each data element of the second matrix of data elements having a second address corresponding thereto, the apparatus comprising a bus having a data portion for transmitting the data elements and an address portion for transmitting addresses from the processor; and a plurality of accumulators coupled to the bus and arranged in a matrix corresponding to the first and second matrices of data elements, each of the plurality of accumulators including an arithmetic unit for performing arithmetic operations on the first data elements to provide second data elements, the arithmetic unit being coupled to the data portion of the bus; and an address selection unit for monitoring the bus for particular first addresses, causing the arithmetic unit to perform arithmetic operations on the first data elements corresponding to the particular first addresses in accordance with programmed rules that are determined by the particular first addresses and causing the arithmetic unit to provide the second data elements corresponding to particular second addresses to the data portion of the bus, the address selection unit being coupled to the address portion of the bus and to the arithmetic unit.
 15. The apparatus of claim 14 wherein the arithmetic unit performs additions and subtractions in accordance with the programmed rules.
 16. The apparatus of claim 14 wherein the arithmetic operations are performed to provide a discrete cosine transform.
 17. The apparatus of claim 16 wherein the second matrix of data elements correspond to transform coefficients for use in performing a discrete cosine transform.
 18. A method for performing a discrete cosine transform on a matrix of data to provide a transformed matrix of data, the matrix of data having a plurality of row locations and a plurality of column locations, the method comprising providing a plurality of input accumulators, the plurality of input accumulators corresponding to respective row and column locations of the input matrix; accumulating the input matrix of data in parallel using the plurality of input accumulators to provide a plurality of transform coefficient outputs; multiplying the transform coefficient outputs by constants to provide a plurality of transform products; and accumulating the transform products to provide the transformed matrix of data.